Overview

Dataset statistics

Number of variables: 88
Number of observations: 290898
Missing cells: 6184922
Missing cells (%): 24.2%
Total size in memory: 195.3 MiB
Average record size in memory: 704.0 B
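
The derived figures in this block can be cross-checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that the average record size is the total memory divided by the number of observations; both assumptions reproduce the reported values, but the total size is itself rounded to 0.1 MiB, so the last digit is approximate.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 88
n_observations = 290_898
missing_cells = 6_184_922
total_mib = 195.3  # rounded value as printed in the report

# Assumption: percentages are relative to the full variables x observations grid.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024**2 bytes, averaged over all observations.
avg_record_bytes = total_mib * 1024**2 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```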

Variable types

Text: 88

Dataset

Description: Naturalis Biodiversity Center (NL) - Aves 0061686-241126133413365
URL: https://doi.org/10.15468/dl.u5tv27

Alerts

license has constant value "CC0_1_0" Constant
publisher has constant value "Naturalis Biodiversity Center" Constant
rightsHolder has constant value "Naturalis Biodiversity Center" Constant
institutionID has constant value "https://ror.org/0566bfb96" Constant
collectionCode has constant value "Aves" Constant
basisOfRecord has constant value "PRESERVED_SPECIMEN" Constant
occurrenceStatus has constant value "PRESENT" Constant
associatedTaxa has constant value "has parasite: Cirrophthirius cf. recurvirostrae | Quadraceps sp." Constant
nomenclaturalCode has constant value "ICZN" Constant
datasetKey has constant value "889c91a3-614f-4355-8df8-b6d0260a118c" Constant
publishingCountry has constant value "NL" Constant
protocol has constant value "DWC_ARCHIVE" Constant
lastCrawled has constant value "2025-01-03T11:34:30.428Z" Constant
isSequenced has constant value "false" Constant
publishedByGbifRegion has constant value "EUROPE" Constant
recordNumber has 277608 (95.4%) missing values Missing
recordedBy has 93217 (32.0%) missing values Missing
individualCount has 30538 (10.5%) missing values Missing
sex has 98571 (33.9%) missing values Missing
lifeStage has 210308 (72.3%) missing values Missing
associatedTaxa has 290895 (> 99.9%) missing values Missing
eventDate has 74430 (25.6%) missing values Missing
startDayOfYear has 74430 (25.6%) missing values Missing
endDayOfYear has 74430 (25.6%) missing values Missing
year has 78830 (27.1%) missing values Missing
month has 87276 (30.0%) missing values Missing
day has 101613 (34.9%) missing values Missing
verbatimEventDate has 59902 (20.6%) missing values Missing
continent has 94391 (32.4%) missing values Missing
island has 200600 (69.0%) missing values Missing
countryCode has 47203 (16.2%) missing values Missing
stateProvince has 137182 (47.2%) missing values Missing
locality has 79647 (27.4%) missing values Missing
verbatimElevation has 288311 (99.1%) missing values Missing
decimalLatitude has 139112 (47.8%) missing values Missing
decimalLongitude has 139112 (47.8%) missing values Missing
coordinateUncertaintyInMeters has 289239 (99.4%) missing values Missing
typeStatus has 287427 (98.8%) missing values Missing
identifiedBy has 290486 (99.9%) missing values Missing
dateIdentified has 290641 (99.9%) missing values Missing
specificEpithet has 10799 (3.7%) missing values Missing
infraspecificEpithet has 125699 (43.2%) missing values Missing
distanceFromCentroidInMeters has 289238 (99.4%) missing values Missing
mediaType has 207500 (71.3%) missing values Missing
speciesKey has 10568 (3.6%) missing values Missing
species has 10568 (3.6%) missing values Missing
repatriated has 46939 (16.1%) missing values Missing
gbifRegion has 50475 (17.4%) missing values Missing
level0Gid has 158562 (54.5%) missing values Missing
level0Name has 158562 (54.5%) missing values Missing
level1Gid has 159606 (54.9%) missing values Missing
level1Name has 159606 (54.9%) missing values Missing
level2Gid has 161386 (55.5%) missing values Missing
level2Name has 161392 (55.5%) missing values Missing
level3Gid has 227914 (78.3%) missing values Missing
level3Name has 229332 (78.8%) missing values Missing
iucnRedListCategory has 167789 (57.7%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
catalogNumber has unique values Unique
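
As an illustration of how the three alert types above relate to a column's contents, the following sketch labels a column as Constant, Missing, or Unique. The function `classify_column` and its rules are hypothetical and written for this example only; the actual rules and thresholds used by ydata-profiling may differ. Note that a column can be both Constant and Missing, as `associatedTaxa` is in this report.

```python
def classify_column(values):
    """Return alert labels for one column; None marks a missing cell.

    Hypothetical rules for illustration, not the profiler's own logic.
    """
    present = [v for v in values if v is not None]
    n_missing = len(values) - len(present)
    distinct = set(present)

    alerts = []
    if len(distinct) == 1:
        alerts.append("Constant")   # every non-missing cell holds the same value
    if n_missing > 0:
        alerts.append("Missing")    # at least one cell is empty
    if present and n_missing == 0 and len(distinct) == len(values):
        alerts.append("Unique")     # every row holds a different value
    return alerts


# Tiny made-up columns that mimic the patterns reported above.
columns = {
    "collectionCode": ["Aves", "Aves", "Aves", "Aves"],
    "associatedTaxa": [
        None,
        None,
        "has parasite: Cirrophthirius cf. recurvirostrae | Quadraceps sp.",
        None,
    ],
    "catalogNumber": ["ZMA.AVES.2", "RMNH.AVES.4", "ZMA.AVES.18", "ZMA.AVES.27"],
}

alerts = {name: classify_column(vals) for name, vals in columns.items()}
for name, labels in alerts.items():
    print(name, labels)
```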

Reproduction

Analysis started: 2025-01-08 23:40:02.757842
Analysis finished: 2025-01-08 23:40:11.593245
Duration: 8.84 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
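
The reported duration follows directly from the two timestamps. A minimal check, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2025-01-08 23:40:02.757842")
finished = datetime.fromisoformat("2025-01-08 23:40:11.593245")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```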

Variables

gbifID
Text

Unique 

Distinct: 290898
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 2908980
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
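
To make the category counts concrete, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is an illustration only and not the profiler's own implementation; the helper name `category_counts` is made up for this example. The standard library exposes the general category (for example `Nd` for Decimal Number, `Lu` for Uppercase Letter, `Pc` for Connector Punctuation) but not the script or block of a character, so only categories are covered here.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# One gbifID value and one license value taken from the samples in this report.
gbif_counts = category_counts(["2434047501"])
license_counts = category_counts(["CC0_1_0"])

print(dict(gbif_counts))
print(dict(license_counts))
```

For the single license value, the ratio of decimal digits to uppercase letters to connector punctuation is 3 : 2 : 2, which matches the 42.9% / 28.6% / 28.6% split reported for the constant `license` column.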

Unique

Unique: 290898
Unique (%): 100.0%

Sample

1st row: 2434047501
2nd row: 2434047502
3rd row: 2434047503
4th row: 2434047504
5th row: 2434047505

Value                   Count    Frequency (%)
2434047501              1        < 0.1%
2433858690              1        < 0.1%
2434047505              1        < 0.1%
2434047506              1        < 0.1%
2434047507              1        < 0.1%
2434047508              1        < 0.1%
2434047523              1        < 0.1%
2434047509              1        < 0.1%
2433858683              1        < 0.1%
2434047503              1        < 0.1%
Other values (290888)   290888   > 99.9%

Most occurring characters

ValueCountFrequency (%)
4 648315
22.3%
3 508615
17.5%
2 477380
16.4%
1 244327
 
8.4%
0 214623
 
7.4%
9 195309
 
6.7%
8 174998
 
6.0%
7 151616
 
5.2%
5 148917
 
5.1%
6 144880
 
5.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2908980
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 648315
22.3%
3 508615
17.5%
2 477380
16.4%
1 244327
 
8.4%
0 214623
 
7.4%
9 195309
 
6.7%
8 174998
 
6.0%
7 151616
 
5.2%
5 148917
 
5.1%
6 144880
 
5.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2908980
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 648315
22.3%
3 508615
17.5%
2 477380
16.4%
1 244327
 
8.4%
0 214623
 
7.4%
9 195309
 
6.7%
8 174998
 
6.0%
7 151616
 
5.2%
5 148917
 
5.1%
6 144880
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2908980
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 648315
22.3%
3 508615
17.5%
2 477380
16.4%
1 244327
 
8.4%
0 214623
 
7.4%
9 195309
 
6.7%
8 174998
 
6.0%
7 151616
 
5.2%
5 148917
 
5.1%
6 144880
 
5.0%

license
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 2036286
Distinct characters: 4
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CC0_1_0
2nd row: CC0_1_0
3rd row: CC0_1_0
4th row: CC0_1_0
5th row: CC0_1_0

Value     Count    Frequency (%)
cc0_1_0   290898   100.0%

Most occurring characters

ValueCountFrequency (%)
C 581796
28.6%
0 581796
28.6%
_ 581796
28.6%
1 290898
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 872694
42.9%
Uppercase Letter 581796
28.6%
Connector Punctuation 581796
28.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 581796
66.7%
1 290898
33.3%
Uppercase Letter
ValueCountFrequency (%)
C 581796
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 581796
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1454490
71.4%
Latin 581796
 
28.6%

Most frequent character per script

Common
ValueCountFrequency (%)
0 581796
40.0%
_ 581796
40.0%
1 290898
20.0%
Latin
ValueCountFrequency (%)
C 581796
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2036286
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 581796
28.6%
0 581796
28.6%
_ 581796
28.6%
1 290898
14.3%

Distinct: 1170
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 20
Median length: 20
Mean length: 20
Min length: 20

Characters and Unicode

Total characters: 5817960
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 229
Unique (%): 0.1%

Sample

1st row: 2015-06-05T00:00:00Z
2nd row: 2023-05-16T00:00:00Z
3rd row: 2015-09-02T00:00:00Z
4th row: 2017-07-01T00:00:00Z
5th row: 2015-05-23T00:00:00Z

Value                  Count   Frequency (%)
2017-06-30t00:00:00z   48811   16.8%
2023-05-16t00:00:00z   41000   14.1%
2017-07-01t00:00:00z   26280   9.0%
2015-05-23t00:00:00z   17611   6.1%
2015-07-03t00:00:00z   13223   4.5%
2015-05-18t00:00:00z   11421   3.9%
2015-07-01t00:00:00z   10549   3.6%
2015-06-24t00:00:00z   9657    3.3%
2015-07-02t00:00:00z   9646    3.3%
2015-06-23t00:00:00z   9602    3.3%
Other values (1160)    93098   32.0%

Most occurring characters

ValueCountFrequency (%)
0 2479444
42.6%
- 581796
 
10.0%
: 581796
 
10.0%
2 488823
 
8.4%
1 370421
 
6.4%
T 290898
 
5.0%
Z 290898
 
5.0%
5 236072
 
4.1%
3 147699
 
2.5%
6 142899
 
2.5%
Other values (4) 207214
 
3.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4072572
70.0%
Dash Punctuation 581796
 
10.0%
Other Punctuation 581796
 
10.0%
Uppercase Letter 581796
 
10.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2479444
60.9%
2 488823
 
12.0%
1 370421
 
9.1%
5 236072
 
5.8%
3 147699
 
3.6%
6 142899
 
3.5%
7 140323
 
3.4%
8 26588
 
0.7%
9 21411
 
0.5%
4 18892
 
0.5%
Uppercase Letter
ValueCountFrequency (%)
T 290898
50.0%
Z 290898
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 581796
100.0%
Other Punctuation
ValueCountFrequency (%)
: 581796
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5236164
90.0%
Latin 581796
 
10.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2479444
47.4%
- 581796
 
11.1%
: 581796
 
11.1%
2 488823
 
9.3%
1 370421
 
7.1%
5 236072
 
4.5%
3 147699
 
2.8%
6 142899
 
2.7%
7 140323
 
2.7%
8 26588
 
0.5%
Other values (2) 40303
 
0.8%
Latin
ValueCountFrequency (%)
T 290898
50.0%
Z 290898
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5817960
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2479444
42.6%
- 581796
 
10.0%
: 581796
 
10.0%
2 488823
 
8.4%
1 370421
 
6.4%
T 290898
 
5.0%
Z 290898
 
5.0%
5 236072
 
4.1%
3 147699
 
2.5%
6 142899
 
2.5%
Other values (4) 207214
 
3.6%

publisher
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 29
Median length: 29
Mean length: 29
Min length: 29

Characters and Unicode

Total characters: 8436042
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Naturalis Biodiversity Center
2nd row: Naturalis Biodiversity Center
3rd row: Naturalis Biodiversity Center
4th row: Naturalis Biodiversity Center
5th row: Naturalis Biodiversity Center

Value          Count    Frequency (%)
naturalis      290898   33.3%
biodiversity   290898   33.3%
center         290898   33.3%

Most occurring characters

ValueCountFrequency (%)
i 1163592
13.8%
t 872694
10.3%
r 872694
10.3%
e 872694
10.3%
581796
 
6.9%
s 581796
 
6.9%
a 581796
 
6.9%
d 290898
 
3.4%
C 290898
 
3.4%
y 290898
 
3.4%
Other values (7) 2036286
24.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6981552
82.8%
Uppercase Letter 872694
 
10.3%
Space Separator 581796
 
6.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1163592
16.7%
t 872694
12.5%
r 872694
12.5%
e 872694
12.5%
s 581796
8.3%
a 581796
8.3%
d 290898
 
4.2%
y 290898
 
4.2%
v 290898
 
4.2%
o 290898
 
4.2%
Other values (3) 872694
12.5%
Uppercase Letter
ValueCountFrequency (%)
C 290898
33.3%
N 290898
33.3%
B 290898
33.3%
Space Separator
ValueCountFrequency (%)
581796
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7854246
93.1%
Common 581796
 
6.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1163592
14.8%
t 872694
11.1%
r 872694
11.1%
e 872694
11.1%
s 581796
 
7.4%
a 581796
 
7.4%
d 290898
 
3.7%
C 290898
 
3.7%
y 290898
 
3.7%
v 290898
 
3.7%
Other values (6) 1745388
22.2%
Common
ValueCountFrequency (%)
581796
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8436042
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1163592
13.8%
t 872694
10.3%
r 872694
10.3%
e 872694
10.3%
581796
 
6.9%
s 581796
 
6.9%
a 581796
 
6.9%
d 290898
 
3.4%
C 290898
 
3.4%
y 290898
 
3.4%
Other values (7) 2036286
24.1%

rightsHolder
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 29
Median length: 29
Mean length: 29
Min length: 29

Characters and Unicode

Total characters: 8436042
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Naturalis Biodiversity Center
2nd row: Naturalis Biodiversity Center
3rd row: Naturalis Biodiversity Center
4th row: Naturalis Biodiversity Center
5th row: Naturalis Biodiversity Center

Value          Count    Frequency (%)
naturalis      290898   33.3%
biodiversity   290898   33.3%
center         290898   33.3%

Most occurring characters

ValueCountFrequency (%)
i 1163592
13.8%
t 872694
10.3%
r 872694
10.3%
e 872694
10.3%
581796
 
6.9%
s 581796
 
6.9%
a 581796
 
6.9%
d 290898
 
3.4%
C 290898
 
3.4%
y 290898
 
3.4%
Other values (7) 2036286
24.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6981552
82.8%
Uppercase Letter 872694
 
10.3%
Space Separator 581796
 
6.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1163592
16.7%
t 872694
12.5%
r 872694
12.5%
e 872694
12.5%
s 581796
8.3%
a 581796
8.3%
d 290898
 
4.2%
y 290898
 
4.2%
v 290898
 
4.2%
o 290898
 
4.2%
Other values (3) 872694
12.5%
Uppercase Letter
ValueCountFrequency (%)
C 290898
33.3%
N 290898
33.3%
B 290898
33.3%
Space Separator
ValueCountFrequency (%)
581796
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7854246
93.1%
Common 581796
 
6.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1163592
14.8%
t 872694
11.1%
r 872694
11.1%
e 872694
11.1%
s 581796
 
7.4%
a 581796
 
7.4%
d 290898
 
3.7%
C 290898
 
3.7%
y 290898
 
3.7%
v 290898
 
3.7%
Other values (6) 1745388
22.2%
Common
ValueCountFrequency (%)
581796
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8436042
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1163592
13.8%
t 872694
10.3%
r 872694
10.3%
e 872694
10.3%
581796
 
6.9%
s 581796
 
6.9%
a 581796
 
6.9%
d 290898
 
3.4%
C 290898
 
3.4%
y 290898
 
3.4%
Other values (7) 2036286
24.1%

institutionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 25
Median length: 25
Mean length: 25
Min length: 25

Characters and Unicode

Total characters: 7272450
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: https://ror.org/0566bfb96
2nd row: https://ror.org/0566bfb96
3rd row: https://ror.org/0566bfb96
4th row: https://ror.org/0566bfb96
5th row: https://ror.org/0566bfb96

Value                       Count    Frequency (%)
https://ror.org/0566bfb96   290898   100.0%

Most occurring characters

ValueCountFrequency (%)
/ 872694
12.0%
r 872694
12.0%
6 872694
12.0%
t 581796
 
8.0%
o 581796
 
8.0%
b 581796
 
8.0%
h 290898
 
4.0%
p 290898
 
4.0%
s 290898
 
4.0%
: 290898
 
4.0%
Other values (6) 1745388
24.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4072572
56.0%
Decimal Number 1745388
24.0%
Other Punctuation 1454490
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 872694
21.4%
t 581796
14.3%
o 581796
14.3%
b 581796
14.3%
h 290898
 
7.1%
p 290898
 
7.1%
s 290898
 
7.1%
g 290898
 
7.1%
f 290898
 
7.1%
Decimal Number
ValueCountFrequency (%)
6 872694
50.0%
0 290898
 
16.7%
5 290898
 
16.7%
9 290898
 
16.7%
Other Punctuation
ValueCountFrequency (%)
/ 872694
60.0%
: 290898
 
20.0%
. 290898
 
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4072572
56.0%
Common 3199878
44.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 872694
21.4%
t 581796
14.3%
o 581796
14.3%
b 581796
14.3%
h 290898
 
7.1%
p 290898
 
7.1%
s 290898
 
7.1%
g 290898
 
7.1%
f 290898
 
7.1%
Common
ValueCountFrequency (%)
/ 872694
27.3%
6 872694
27.3%
: 290898
 
9.1%
. 290898
 
9.1%
0 290898
 
9.1%
5 290898
 
9.1%
9 290898
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7272450
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 872694
12.0%
r 872694
12.0%
6 872694
12.0%
t 581796
 
8.0%
o 581796
 
8.0%
b 581796
 
8.0%
h 290898
 
4.0%
p 290898
 
4.0%
s 290898
 
4.0%
: 290898
 
4.0%
Other values (6) 1745388
24.0%

collectionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1163592
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Aves
2nd row: Aves
3rd row: Aves
4th row: Aves
5th row: Aves

Value   Count    Frequency (%)
aves    290898   100.0%

Most occurring characters

ValueCountFrequency (%)
A 290898
25.0%
v 290898
25.0%
e 290898
25.0%
s 290898
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 872694
75.0%
Uppercase Letter 290898
 
25.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
v 290898
33.3%
e 290898
33.3%
s 290898
33.3%
Uppercase Letter
ValueCountFrequency (%)
A 290898
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1163592
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 290898
25.0%
v 290898
25.0%
e 290898
25.0%
s 290898
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1163592
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 290898
25.0%
v 290898
25.0%
e 290898
25.0%
s 290898
25.0%

basisOfRecord
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 18
Median length: 18
Mean length: 18
Min length: 18

Characters and Unicode

Total characters: 5236164
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRESERVED_SPECIMEN
2nd row: PRESERVED_SPECIMEN
3rd row: PRESERVED_SPECIMEN
4th row: PRESERVED_SPECIMEN
5th row: PRESERVED_SPECIMEN

Value                Count    Frequency (%)
preserved_specimen   290898   100.0%

Most occurring characters

ValueCountFrequency (%)
E 1454490
27.8%
P 581796
 
11.1%
R 581796
 
11.1%
S 581796
 
11.1%
V 290898
 
5.6%
D 290898
 
5.6%
_ 290898
 
5.6%
C 290898
 
5.6%
I 290898
 
5.6%
M 290898
 
5.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4945266
94.4%
Connector Punctuation 290898
 
5.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 1454490
29.4%
P 581796
 
11.8%
R 581796
 
11.8%
S 581796
 
11.8%
V 290898
 
5.9%
D 290898
 
5.9%
C 290898
 
5.9%
I 290898
 
5.9%
M 290898
 
5.9%
N 290898
 
5.9%
Connector Punctuation
ValueCountFrequency (%)
_ 290898
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4945266
94.4%
Common 290898
 
5.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 1454490
29.4%
P 581796
 
11.8%
R 581796
 
11.8%
S 581796
 
11.8%
V 290898
 
5.9%
D 290898
 
5.9%
C 290898
 
5.9%
I 290898
 
5.9%
M 290898
 
5.9%
N 290898
 
5.9%
Common
ValueCountFrequency (%)
_ 290898
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5236164
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 1454490
27.8%
P 581796
 
11.1%
R 581796
 
11.1%
S 581796
 
11.1%
V 290898
 
5.6%
D 290898
 
5.6%
_ 290898
 
5.6%
C 290898
 
5.6%
I 290898
 
5.6%
M 290898
 
5.6%

occurrenceID
Text

Unique 

Distinct: 290898
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 77
Median length: 71
Mean length: 67.20245241
Min length: 62

Characters and Unicode

Total characters: 19549059
Distinct characters: 44
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 290898
Unique (%): 100.0%

Sample

1st row: https://data.biodiversitydata.nl/naturalis/specimen/ZMA.AVES.2
2nd row: https://data.biodiversitydata.nl/naturalis/specimen/RMNH.AVES.4
3rd row: https://data.biodiversitydata.nl/naturalis/specimen/ZMA.AVES.18
4th row: https://data.biodiversitydata.nl/naturalis/specimen/ZMA.AVES.27
5th row: https://data.biodiversitydata.nl/naturalis/specimen/ZMA.AVES.36

Value                                                              Count    Frequency (%)
https://data.biodiversitydata.nl/naturalis/specimen/zma.aves.2     1        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/rmnh.5069558   1        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/zma.aves.36    1        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/zma.aves.45    1        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/zma.aves.54    1        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/zma.aves.72    1        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/zma.aves.222   1        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/zma.aves.81    1        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/rmnh.5069738   1        < 0.1%
https://data.biodiversitydata.nl/naturalis/specimen/zma.aves.18    1        < 0.1%
Other values (290888)                                              290888   > 99.9%

Most occurring characters

ValueCountFrequency (%)
a 1748359
 
8.9%
t 1745388
 
8.9%
/ 1454490
 
7.4%
i 1454490
 
7.4%
. 1171254
 
6.0%
s 1163592
 
6.0%
d 872773
 
4.5%
e 872704
 
4.5%
n 872694
 
4.5%
l 581796
 
3.0%
Other values (34) 7611519
38.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12806742
65.5%
Other Punctuation 2916642
 
14.9%
Uppercase Letter 2257505
 
11.5%
Decimal Number 1568170
 
8.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1748359
13.7%
t 1745388
13.6%
i 1454490
11.4%
s 1163592
9.1%
d 872773
 
6.8%
e 872704
 
6.8%
n 872694
 
6.8%
l 581796
 
4.5%
p 581796
 
4.5%
r 581796
 
4.5%
Other values (9) 2331354
18.2%
Uppercase Letter
ValueCountFrequency (%)
A 353920
15.7%
M 290897
12.9%
E 289127
12.8%
S 289126
12.8%
V 289126
12.8%
R 226103
10.0%
N 226103
10.0%
H 226103
10.0%
Z 64794
 
2.9%
P 2204
 
0.1%
Other values (2) 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 229669
14.6%
2 204495
13.0%
5 154944
9.9%
3 152835
9.7%
4 146877
9.4%
6 140057
8.9%
0 137531
8.8%
7 135756
8.7%
8 134201
8.6%
9 131805
8.4%
Other Punctuation
ValueCountFrequency (%)
/ 1454490
49.9%
. 1171254
40.2%
: 290898
 
10.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15064247
77.1%
Common 4484812
 
22.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1748359
 
11.6%
t 1745388
 
11.6%
i 1454490
 
9.7%
s 1163592
 
7.7%
d 872773
 
5.8%
e 872704
 
5.8%
n 872694
 
5.8%
l 581796
 
3.9%
p 581796
 
3.9%
r 581796
 
3.9%
Other values (21) 4588859
30.5%
Common
ValueCountFrequency (%)
/ 1454490
32.4%
. 1171254
26.1%
: 290898
 
6.5%
1 229669
 
5.1%
2 204495
 
4.6%
5 154944
 
3.5%
3 152835
 
3.4%
4 146877
 
3.3%
6 140057
 
3.1%
0 137531
 
3.1%
Other values (3) 401762
 
9.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19549059
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1748359
 
8.9%
t 1745388
 
8.9%
/ 1454490
 
7.4%
i 1454490
 
7.4%
. 1171254
 
6.0%
s 1163592
 
6.0%
d 872773
 
4.5%
e 872704
 
4.5%
n 872694
 
4.5%
l 581796
 
3.0%
Other values (34) 7611519
38.9%

catalogNumber
Text

Unique 

Distinct: 290898
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 25
Median length: 19
Mean length: 15.20245241
Min length: 10

Characters and Unicode

Total characters: 4422363
Distinct characters: 31
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 290898
Unique (%): 100.0%

Sample

1st row: ZMA.AVES.2
2nd row: RMNH.AVES.4
3rd row: ZMA.AVES.18
4th row: ZMA.AVES.27
5th row: ZMA.AVES.36

Value                   Count    Frequency (%)
zma.aves.2              1        < 0.1%
rmnh.5069558            1        < 0.1%
zma.aves.36             1        < 0.1%
zma.aves.45             1        < 0.1%
zma.aves.54             1        < 0.1%
zma.aves.72             1        < 0.1%
zma.aves.222            1        < 0.1%
zma.aves.81             1        < 0.1%
rmnh.5069738            1        < 0.1%
zma.aves.18             1        < 0.1%
Other values (290888)   290888   > 99.9%

Most occurring characters

ValueCountFrequency (%)
. 589458
13.3%
A 353920
 
8.0%
M 290897
 
6.6%
E 289127
 
6.5%
V 289126
 
6.5%
S 289126
 
6.5%
1 229669
 
5.2%
N 226103
 
5.1%
R 226103
 
5.1%
H 226103
 
5.1%
Other values (21) 1412731
31.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2257505
51.0%
Decimal Number 1568170
35.5%
Other Punctuation 589458
 
13.3%
Lowercase Letter 7230
 
0.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 353920
15.7%
M 290897
12.9%
E 289127
12.8%
V 289126
12.8%
S 289126
12.8%
N 226103
10.0%
R 226103
10.0%
H 226103
10.0%
Z 64794
 
2.9%
P 2204
 
0.1%
Other values (2) 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 229669
14.6%
2 204495
13.0%
5 154944
9.9%
3 152835
9.7%
4 146877
9.4%
6 140057
8.9%
0 137531
8.8%
7 135756
8.7%
8 134201
8.6%
9 131805
8.4%
Lowercase Letter
ValueCountFrequency (%)
b 2993
41.4%
a 2971
41.1%
c 1060
 
14.7%
x 106
 
1.5%
d 79
 
1.1%
e 10
 
0.1%
y 10
 
0.1%
v 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
. 589458
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2264735
51.2%
Common 2157628
48.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 353920
15.6%
M 290897
12.8%
E 289127
12.8%
V 289126
12.8%
S 289126
12.8%
N 226103
10.0%
R 226103
10.0%
H 226103
10.0%
Z 64794
 
2.9%
b 2993
 
0.1%
Other values (10) 6443
 
0.3%
Common
ValueCountFrequency (%)
. 589458
27.3%
1 229669
 
10.6%
2 204495
 
9.5%
5 154944
 
7.2%
3 152835
 
7.1%
4 146877
 
6.8%
6 140057
 
6.5%
0 137531
 
6.4%
7 135756
 
6.3%
8 134201
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4422363
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 589458
13.3%
A 353920
 
8.0%
M 290897
 
6.6%
E 289127
 
6.5%
V 289126
 
6.5%
S 289126
 
6.5%
1 229669
 
5.2%
N 226103
 
5.1%
R 226103
 
5.1%
H 226103
 
5.1%
Other values (21) 1412731
31.9%

recordNumber
Text

Missing 

Distinct: 5837
Distinct (%): 43.9%
Missing: 277608
Missing (%): 95.4%
Memory size: 2.2 MiB

Length

Max length: 25
Median length: 22
Mean length: 4.631226486
Min length: 1

Characters and Unicode

Total characters: 61549
Distinct characters: 73
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4106
Unique (%): 30.9%
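
For a mostly empty column such as recordNumber, the percentages and the mean length only add up when the denominator is the number of non-missing rows rather than all observations. The check below is a sketch under that assumption, using only figures printed in this report; it reproduces the reported Distinct (%), Unique (%), and mean length.

```python
# Figures copied from the report for recordNumber.
n_rows = 290_898
n_missing = 277_608
n_distinct = 5_837
n_unique = 4_106
total_chars = 61_549

# Assumption: ratios are computed over non-missing rows only.
n_present = n_rows - n_missing
distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present
mean_length = total_chars / n_present

print(f"non-missing rows: {n_present}")
print(f"distinct: {distinct_pct:.1f}%")
print(f"unique: {unique_pct:.1f}%")
print(f"mean length: {mean_length:.9f}")
```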

Sample

1st row: 1.3
2nd row: 4.3
3rd row: 6.4
4th row: 15
5th row: 175

Value                 Count   Frequency (%)
no                    3016    17.2%
reg                   601     3.4%
reg.no                175     1.0%
n                     85      0.5%
verz                  57      0.3%
coll.-no              49      0.3%
2                     47      0.3%
3                     41      0.2%
1                     41      0.2%
6                     34      0.2%
Other values (4160)   13389   76.4%

Most occurring characters

ValueCountFrequency (%)
1 7134
11.6%
4 4703
 
7.6%
3 4671
 
7.6%
2 4607
 
7.5%
4247
 
6.9%
. 4085
 
6.6%
5 3931
 
6.4%
6 3619
 
5.9%
7 3512
 
5.7%
o 3431
 
5.6%
Other values (63) 17609
28.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 41965
68.2%
Lowercase Letter 6638
 
10.8%
Other Punctuation 4273
 
6.9%
Space Separator 4247
 
6.9%
Uppercase Letter 4115
 
6.7%
Close Punctuation 103
 
0.2%
Open Punctuation 103
 
0.2%
Dash Punctuation 81
 
0.1%
Math Symbol 24
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 3431
51.7%
e 965
 
14.5%
g 815
 
12.3%
n 422
 
6.4%
r 223
 
3.4%
l 215
 
3.2%
v 79
 
1.2%
a 76
 
1.1%
z 73
 
1.1%
c 65
 
1.0%
Other values (14) 274
 
4.1%
Uppercase Letter
ValueCountFrequency (%)
N 3030
73.6%
R 739
 
18.0%
C 140
 
3.4%
I 65
 
1.6%
X 32
 
0.8%
V 16
 
0.4%
A 15
 
0.4%
G 13
 
0.3%
B 12
 
0.3%
L 12
 
0.3%
Other values (13) 41
 
1.0%
Decimal Number
ValueCountFrequency (%)
1 7134
17.0%
4 4703
11.2%
3 4671
11.1%
2 4607
11.0%
5 3931
9.4%
6 3619
8.6%
7 3512
8.4%
8 3321
7.9%
0 3240
7.7%
9 3227
7.7%
Other Punctuation
ValueCountFrequency (%)
. 4085
95.6%
: 105
 
2.5%
' 30
 
0.7%
, 16
 
0.4%
/ 16
 
0.4%
? 15
 
0.4%
; 4
 
0.1%
& 1
 
< 0.1%
1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 101
98.1%
] 2
 
1.9%
Open Punctuation
ValueCountFrequency (%)
( 101
98.1%
[ 2
 
1.9%
Space Separator
ValueCountFrequency (%)
4247
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 81
100.0%
Math Symbol
ValueCountFrequency (%)
= 24
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 50796
82.5%
Latin 10753
 
17.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 3431
31.9%
N 3030
28.2%
e 965
 
9.0%
g 815
 
7.6%
R 739
 
6.9%
n 422
 
3.9%
r 223
 
2.1%
l 215
 
2.0%
C 140
 
1.3%
v 79
 
0.7%
Other values (37) 694
 
6.5%
Common
ValueCountFrequency (%)
1 7134
14.0%
4 4703
9.3%
3 4671
9.2%
2 4607
9.1%
4247
8.4%
. 4085
8.0%
5 3931
7.7%
6 3619
7.1%
7 3512
6.9%
8 3321
6.5%
Other values (16) 6966
13.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 61548
> 99.9%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 7134
11.6%
4 4703
 
7.6%
3 4671
 
7.6%
2 4607
 
7.5%
4247
 
6.9%
. 4085
 
6.6%
5 3931
 
6.4%
6 3619
 
5.9%
7 3512
 
5.7%
o 3431
 
5.6%
Other values (62) 17608
28.6%
Punctuation
ValueCountFrequency (%)
1
100.0%

recordedBy
Text

Missing 

Distinct: 11884
Distinct (%): 6.0%
Missing: 93217
Missing (%): 32.0%
Memory size: 2.2 MiB

Length

Max length: 252
Median length: 227
Mean length: 15.05396573
Min length: 2

Characters and Unicode

Total characters: 2975883
Distinct characters: 101
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6886
Unique (%): 3.5%

Sample

1st rowVan der Spruyt G.S.
2nd rowGroen J.
3rd rowPollen&vDam cf Apr'63-Jun'66
4th rowPloos van Amstel D.
5th rowEbels E.
Value                  Count     Frequency (%)
van                    28494     5.3%
not                    14646     2.7%
stated                 13574     2.5%
(blank)                12973     2.4%
bartels                11552     2.2%
j                      10799     2.0%
de                     10438     1.9%
heurn                  8706      1.6%
m.e.g                  8361      1.6%
f                      7251      1.4%
Other values (8569)    408661    76.3%

Most occurring characters

ValueCountFrequency (%)
. 347920
 
11.7%
339595
 
11.4%
e 267162
 
9.0%
n 167094
 
5.6%
a 147430
 
5.0%
r 142614
 
4.8%
o 125459
 
4.2%
t 117646
 
4.0%
s 116659
 
3.9%
l 83016
 
2.8%
Other values (91) 1121288
37.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1637630
55.0%
Uppercase Letter 610104
 
20.5%
Other Punctuation 377070
 
12.7%
Space Separator 339595
 
11.4%
Decimal Number 4048
 
0.1%
Open Punctuation 2717
 
0.1%
Close Punctuation 2714
 
0.1%
Dash Punctuation 1952
 
0.1%
Math Symbol 52
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 267162
16.3%
n 167094
10.2%
a 147430
9.0%
r 142614
8.7%
o 125459
 
7.7%
t 117646
 
7.2%
s 116659
 
7.1%
l 83016
 
5.1%
i 72781
 
4.4%
d 62897
 
3.8%
Other values (34) 334872
20.4%
Uppercase Letter
ValueCountFrequency (%)
H 61957
 
10.2%
J 50762
 
8.3%
B 47748
 
7.8%
A 40657
 
6.7%
M 36414
 
6.0%
C 35281
 
5.8%
G 34607
 
5.7%
F 31023
 
5.1%
P 30114
 
4.9%
S 27132
 
4.4%
Other values (17) 214409
35.1%
Other Punctuation
ValueCountFrequency (%)
. 347920
92.3%
& 12702
 
3.4%
: 6268
 
1.7%
; 5183
 
1.4%
/ 1659
 
0.4%
\ 1596
 
0.4%
' 996
 
0.3%
? 377
 
0.1%
" 294
 
0.1%
! 60
 
< 0.1%
Other values (2) 15
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 1074
26.5%
9 747
18.5%
0 629
15.5%
6 456
11.3%
2 349
 
8.6%
3 311
 
7.7%
8 215
 
5.3%
4 133
 
3.3%
7 76
 
1.9%
5 58
 
1.4%
Math Symbol
ValueCountFrequency (%)
= 38
73.1%
> 7
 
13.5%
+ 7
 
13.5%
Space Separator
ValueCountFrequency (%)
339595
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2717
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2714
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1952
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2247734
75.5%
Common 728149
 
24.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 267162
 
11.9%
n 167094
 
7.4%
a 147430
 
6.6%
r 142614
 
6.3%
o 125459
 
5.6%
t 117646
 
5.2%
s 116659
 
5.2%
l 83016
 
3.7%
i 72781
 
3.2%
d 62897
 
2.8%
Other values (61) 944976
42.0%
Common
ValueCountFrequency (%)
. 347920
47.8%
339595
46.6%
& 12702
 
1.7%
: 6268
 
0.9%
; 5183
 
0.7%
( 2717
 
0.4%
) 2714
 
0.4%
- 1952
 
0.3%
/ 1659
 
0.2%
\ 1596
 
0.2%
Other values (20) 5843
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2968095
99.7%
None 7776
 
0.3%
Punctuation 12
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 347920
 
11.7%
339595
 
11.4%
e 267162
 
9.0%
n 167094
 
5.6%
a 147430
 
5.0%
r 142614
 
4.8%
o 125459
 
4.2%
t 117646
 
4.0%
s 116659
 
3.9%
l 83016
 
2.8%
Other values (71) 1113500
37.5%
None
ValueCountFrequency (%)
ü 5145
66.2%
é 1008
 
13.0%
ä 847
 
10.9%
ö 419
 
5.4%
ñ 143
 
1.8%
ø 118
 
1.5%
ë 34
 
0.4%
è 20
 
0.3%
ó 15
 
0.2%
û 8
 
0.1%
Other values (9) 19
 
0.2%
Punctuation
ValueCountFrequency (%)
12
100.0%

individualCount
Text

Missing 

Distinct: 54
Distinct (%): < 0.1%
Missing: 30538
Missing (%): 10.5%
Memory size: 2.2 MiB

Length

Max length: 3
Median length: 1
Mean length: 1.003725611
Min length: 1

Characters and Unicode

Total characters: 261330
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1
Value                Count     Frequency (%)
1                    228649    87.8%
2                    11832     4.5%
3                    6214      2.4%
4                    5617      2.2%
5                    3939      1.5%
6                    1721      0.7%
7                    695       0.3%
8                    426       0.2%
9                    305       0.1%
10                   260       0.1%
Other values (44)    702       0.3%

Most occurring characters

ValueCountFrequency (%)
1 229500
87.8%
2 12051
 
4.6%
3 6372
 
2.4%
4 5687
 
2.2%
5 4035
 
1.5%
6 1786
 
0.7%
7 749
 
0.3%
8 468
 
0.2%
9 372
 
0.1%
0 310
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 261330
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 229500
87.8%
2 12051
 
4.6%
3 6372
 
2.4%
4 5687
 
2.2%
5 4035
 
1.5%
6 1786
 
0.7%
7 749
 
0.3%
8 468
 
0.2%
9 372
 
0.1%
0 310
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 261330
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 229500
87.8%
2 12051
 
4.6%
3 6372
 
2.4%
4 5687
 
2.2%
5 4035
 
1.5%
6 1786
 
0.7%
7 749
 
0.3%
8 468
 
0.2%
9 372
 
0.1%
0 310
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 261330
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 229500
87.8%
2 12051
 
4.6%
3 6372
 
2.4%
4 5687
 
2.2%
5 4035
 
1.5%
6 1786
 
0.7%
7 749
 
0.3%
8 468
 
0.2%
9 372
 
0.1%
0 310
 
0.1%

sex
Text

Missing 

Distinct: 2
Distinct (%): < 0.1%
Missing: 98571
Missing (%): 33.9%
Memory size: 2.2 MiB

Length

Max length: 6
Median length: 4
Mean length: 4.830928575
Min length: 4

Characters and Unicode

Total characters: 929118
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: FEMALE
2nd row: FEMALE
3rd row: MALE
4th row: MALE
5th row: FEMALE
Value     Count     Frequency (%)
male      112422    58.5%
female    79905     41.5%

Most occurring characters

ValueCountFrequency (%)
E 272232
29.3%
M 192327
20.7%
A 192327
20.7%
L 192327
20.7%
F 79905
 
8.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 929118
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 272232
29.3%
M 192327
20.7%
A 192327
20.7%
L 192327
20.7%
F 79905
 
8.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 929118
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 272232
29.3%
M 192327
20.7%
A 192327
20.7%
L 192327
20.7%
F 79905
 
8.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 929118
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 272232
29.3%
M 192327
20.7%
A 192327
20.7%
L 192327
20.7%
F 79905
 
8.6%

lifeStage
Text

Missing 

Distinct: 7
Distinct (%): < 0.1%
Missing: 210308
Missing (%): 72.3%
Memory size: 2.2 MiB

Length

Max length: 8
Median length: 3
Mean length: 4.64470778
Min length: 3

Characters and Unicode

Total characters: 374317
Distinct characters: 22
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Egg
2nd row: Adult
3rd row: Adult
4th row: Immature
5th row: Juvenile
Value       Count    Frequency (%)
egg         41586    51.6%
adult       20821    25.8%
juvenile    13228    16.4%
nestling    3308     4.1%
immature    1546     1.9%
subadult    96       0.1%
embryo      5        < 0.1%

Most occurring characters

ValueCountFrequency (%)
g 86480
23.1%
E 41591
11.1%
l 37453
10.0%
u 35787
9.6%
e 31310
 
8.4%
t 25771
 
6.9%
d 20917
 
5.6%
A 20821
 
5.6%
n 16536
 
4.4%
i 16536
 
4.4%
Other values (12) 41115
11.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 293727
78.5%
Uppercase Letter 80590
 
21.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
g 86480
29.4%
l 37453
12.8%
u 35787
12.2%
e 31310
 
10.7%
t 25771
 
8.8%
d 20917
 
7.1%
n 16536
 
5.6%
i 16536
 
5.6%
v 13228
 
4.5%
s 3308
 
1.1%
Other values (6) 6401
 
2.2%
Uppercase Letter
ValueCountFrequency (%)
E 41591
51.6%
A 20821
25.8%
J 13228
 
16.4%
N 3308
 
4.1%
I 1546
 
1.9%
S 96
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 374317
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
g 86480
23.1%
E 41591
11.1%
l 37453
10.0%
u 35787
9.6%
e 31310
 
8.4%
t 25771
 
6.9%
d 20917
 
5.6%
A 20821
 
5.6%
n 16536
 
4.4%
i 16536
 
4.4%
Other values (12) 41115
11.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 374317
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
g 86480
23.1%
E 41591
11.1%
l 37453
10.0%
u 35787
9.6%
e 31310
 
8.4%
t 25771
 
6.9%
d 20917
 
5.6%
A 20821
 
5.6%
n 16536
 
4.4%
i 16536
 
4.4%
Other values (12) 41115
11.0%

occurrenceStatus
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 2036286
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PRESENT
2nd row: PRESENT
3rd row: PRESENT
4th row: PRESENT
5th row: PRESENT
Value      Count     Frequency (%)
present    290898    100.0%

Most occurring characters

ValueCountFrequency (%)
E 581796
28.6%
P 290898
14.3%
R 290898
14.3%
S 290898
14.3%
N 290898
14.3%
T 290898
14.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2036286
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 581796
28.6%
P 290898
14.3%
R 290898
14.3%
S 290898
14.3%
N 290898
14.3%
T 290898
14.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 2036286
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 581796
28.6%
P 290898
14.3%
R 290898
14.3%
S 290898
14.3%
N 290898
14.3%
T 290898
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2036286
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 581796
28.6%
P 290898
14.3%
R 290898
14.3%
S 290898
14.3%
N 290898
14.3%
T 290898
14.3%
Distinct: 132
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 39
Median length: 37
Mean length: 16.94541729
Min length: 3

Characters and Unicode

Total characters: 4929388
Distinct characters: 44
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 45
Unique (%): < 0.1%

Sample

1st row: skin (mounted skin)
2nd row: egg (air dried)
3rd row: skin (study skin)
4th row: skin (mounted skin)
5th row: skin (study skin)
Value                Count     Frequency (%)
skin                 382852    44.6%
air                  114350    13.3%
dried                114350    13.3%
study                108973    12.7%
mounted              47886     5.6%
egg                  41587     4.8%
skeletonized         7000      0.8%
skeleton             5297      0.6%
nest                 4725      0.5%
whole                4690      0.5%
Other values (57)    27515     3.2%

Most occurring characters

ValueCountFrequency (%)
i 633806
12.9%
568327
11.5%
s 523666
10.6%
n 456393
9.3%
k 398662
8.1%
d 395200
8.0%
) 290700
 
5.9%
( 290700
 
5.9%
e 260863
 
5.3%
r 234948
 
4.8%
Other values (34) 876123
17.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3771413
76.5%
Space Separator 568327
 
11.5%
Close Punctuation 290700
 
5.9%
Open Punctuation 290700
 
5.9%
Uppercase Letter 6292
 
0.1%
Decimal Number 1128
 
< 0.1%
Other Punctuation 601
 
< 0.1%
Math Symbol 217
 
< 0.1%
Dash Punctuation 8
 
< 0.1%
Modifier Symbol 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 633806
16.8%
s 523666
13.9%
n 456393
12.1%
k 398662
10.6%
d 395200
10.5%
e 260863
6.9%
r 234948
 
6.2%
t 180647
 
4.8%
u 161596
 
4.3%
a 122202
 
3.2%
Other values (13) 403430
10.7%
Uppercase Letter
ValueCountFrequency (%)
W 5239
83.3%
O 580
 
9.2%
H 322
 
5.1%
B 88
 
1.4%
L 34
 
0.5%
D 8
 
0.1%
N 8
 
0.1%
A 8
 
0.1%
T 5
 
0.1%
Decimal Number
ValueCountFrequency (%)
9 336
29.8%
6 336
29.8%
7 228
20.2%
0 228
20.2%
Other Punctuation
ValueCountFrequency (%)
% 564
93.8%
& 37
 
6.2%
Space Separator
ValueCountFrequency (%)
568327
100.0%
Close Punctuation
ValueCountFrequency (%)
) 290700
100.0%
Open Punctuation
ValueCountFrequency (%)
( 290700
100.0%
Math Symbol
ValueCountFrequency (%)
> 217
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3777705
76.6%
Common 1151683
 
23.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 633806
16.8%
s 523666
13.9%
n 456393
12.1%
k 398662
10.6%
d 395200
10.5%
e 260863
6.9%
r 234948
 
6.2%
t 180647
 
4.8%
u 161596
 
4.3%
a 122202
 
3.2%
Other values (22) 409722
10.8%
Common
ValueCountFrequency (%)
568327
49.3%
) 290700
25.2%
( 290700
25.2%
% 564
 
< 0.1%
9 336
 
< 0.1%
6 336
 
< 0.1%
7 228
 
< 0.1%
0 228
 
< 0.1%
> 217
 
< 0.1%
& 37
 
< 0.1%
Other values (2) 10
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4929388
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 633806
12.9%
568327
11.5%
s 523666
10.6%
n 456393
9.3%
k 398662
8.1%
d 395200
8.0%
) 290700
 
5.9%
( 290700
 
5.9%
e 260863
 
5.3%
r 234948
 
4.8%
Other values (34) 876123
17.8%

associatedTaxa
Text

Constant  Missing 

Distinct: 1
Distinct (%): 33.3%
Missing: 290895
Missing (%): > 99.9%
Memory size: 2.2 MiB

Length

Max length: 64
Median length: 64
Mean length: 64
Min length: 64

Characters and Unicode

Total characters: 192
Distinct characters: 20
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: has parasite: Cirrophthirius cf. recurvirostrae | Quadraceps sp.
2nd row: has parasite: Cirrophthirius cf. recurvirostrae | Quadraceps sp.
3rd row: has parasite: Cirrophthirius cf. recurvirostrae | Quadraceps sp.
Value             Count    Frequency (%)
has               3        12.5%
parasite          3        12.5%
cirrophthirius    3        12.5%
cf                3        12.5%
recurvirostrae    3        12.5%
(blank)           3        12.5%
quadraceps        3        12.5%
sp                3        12.5%

Most occurring characters

ValueCountFrequency (%)
r 27
14.1%
21
10.9%
s 18
9.4%
a 18
9.4%
i 15
 
7.8%
p 12
 
6.2%
e 12
 
6.2%
h 9
 
4.7%
t 9
 
4.7%
u 9
 
4.7%
Other values (10) 42
21.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 153
79.7%
Space Separator 21
 
10.9%
Other Punctuation 9
 
4.7%
Uppercase Letter 6
 
3.1%
Math Symbol 3
 
1.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 27
17.6%
s 18
11.8%
a 18
11.8%
i 15
9.8%
p 12
7.8%
e 12
7.8%
h 9
 
5.9%
t 9
 
5.9%
u 9
 
5.9%
c 9
 
5.9%
Other values (4) 15
9.8%
Other Punctuation
ValueCountFrequency (%)
. 6
66.7%
: 3
33.3%
Uppercase Letter
ValueCountFrequency (%)
Q 3
50.0%
C 3
50.0%
Space Separator
ValueCountFrequency (%)
21
100.0%
Math Symbol
ValueCountFrequency (%)
| 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 159
82.8%
Common 33
 
17.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 27
17.0%
s 18
11.3%
a 18
11.3%
i 15
9.4%
p 12
7.5%
e 12
7.5%
h 9
 
5.7%
t 9
 
5.7%
u 9
 
5.7%
c 9
 
5.7%
Other values (6) 21
13.2%
Common
ValueCountFrequency (%)
21
63.6%
. 6
 
18.2%
| 3
 
9.1%
: 3
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 192
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 27
14.1%
21
10.9%
s 18
9.4%
a 18
9.4%
i 15
 
7.8%
p 12
 
6.2%
e 12
 
6.2%
h 9
 
4.7%
t 9
 
4.7%
u 9
 
4.7%
Other values (10) 42
21.9%

eventDate
Text

Missing 

Distinct: 44850
Distinct (%): 20.7%
Missing: 74430
Missing (%): 25.6%
Memory size: 2.2 MiB

Length

Max length: 21
Median length: 10
Mean length: 11.38132657
Min length: 10

Characters and Unicode

Total characters: 2463693
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12115
Unique (%): 5.6%

Sample

1st row: 1904-07-15
2nd row: 1887-11-19
3rd row: 2014-01-05
4th row: 2008-09-09
5th row: 2006-04-22
Value                    Count     Frequency (%)
1875-10-01/1875-10-31    571       0.3%
1901-01-01/1901-12-31    442       0.2%
1930-01-01/1951-12-31    384       0.2%
1912-01-01/1916-12-31    312       0.1%
1820-12-01/1821-09-30    311       0.1%
1862-01-01/1862-12-31    295       0.1%
1903-01-01/1908-12-31    285       0.1%
1868-01-01/1868-12-31    283       0.1%
1982-01-01/1982-12-31    260       0.1%
1861-01-01/1861-12-31    242       0.1%
Other values (44840)     213083    98.4%

Most occurring characters

ValueCountFrequency (%)
1 535271
21.7%
- 487302
19.8%
0 371646
15.1%
9 254117
10.3%
2 178899
 
7.3%
8 135169
 
5.5%
3 113107
 
4.6%
6 104773
 
4.3%
5 96538
 
3.9%
7 81317
 
3.3%
Other values (2) 105554
 
4.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1949208
79.1%
Dash Punctuation 487302
 
19.8%
Other Punctuation 27183
 
1.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 535271
27.5%
0 371646
19.1%
9 254117
13.0%
2 178899
 
9.2%
8 135169
 
6.9%
3 113107
 
5.8%
6 104773
 
5.4%
5 96538
 
5.0%
7 81317
 
4.2%
4 78371
 
4.0%
Dash Punctuation
ValueCountFrequency (%)
- 487302
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 27183
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2463693
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 535271
21.7%
- 487302
19.8%
0 371646
15.1%
9 254117
10.3%
2 178899
 
7.3%
8 135169
 
5.5%
3 113107
 
4.6%
6 104773
 
4.3%
5 96538
 
3.9%
7 81317
 
3.3%
Other values (2) 105554
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2463693
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 535271
21.7%
- 487302
19.8%
0 371646
15.1%
9 254117
10.3%
2 178899
 
7.3%
8 135169
 
5.5%
3 113107
 
4.6%
6 104773
 
4.3%
5 96538
 
3.9%
7 81317
 
3.3%
Other values (2) 105554
 
4.3%

startDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.2%
Missing: 74430
Missing (%): 25.6%
Memory size: 2.2 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.633331485
Min length: 1

Characters and Unicode

Total characters: 570032
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 197
2nd row: 323
3rd row: 5
4th row: 253
5th row: 112
Value                 Count     Frequency (%)
1                     13089     6.0%
121                   2472      1.1%
274                   1859      0.9%
91                    1681      0.8%
152                   1646      0.8%
60                    1523      0.7%
32                    1509      0.7%
122                   1452      0.7%
153                   1253      0.6%
305                   1230      0.6%
Other values (356)    188754    87.2%

Most occurring characters

ValueCountFrequency (%)
1 130062
22.8%
2 94786
16.6%
3 75811
13.3%
4 44773
 
7.9%
5 42295
 
7.4%
6 39955
 
7.0%
0 36420
 
6.4%
7 36016
 
6.3%
8 35025
 
6.1%
9 34889
 
6.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 570032
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 130062
22.8%
2 94786
16.6%
3 75811
13.3%
4 44773
 
7.9%
5 42295
 
7.4%
6 39955
 
7.0%
0 36420
 
6.4%
7 36016
 
6.3%
8 35025
 
6.1%
9 34889
 
6.1%

Most occurring scripts

ValueCountFrequency (%)
Common 570032
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 130062
22.8%
2 94786
16.6%
3 75811
13.3%
4 44773
 
7.9%
5 42295
 
7.4%
6 39955
 
7.0%
0 36420
 
6.4%
7 36016
 
6.3%
8 35025
 
6.1%
9 34889
 
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 570032
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 130062
22.8%
2 94786
16.6%
3 75811
13.3%
4 44773
 
7.9%
5 42295
 
7.4%
6 39955
 
7.0%
0 36420
 
6.4%
7 36016
 
6.3%
8 35025
 
6.1%
9 34889
 
6.1%

endDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.2%
Missing: 74430
Missing (%): 25.6%
Memory size: 2.2 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.752028013
Min length: 1

Characters and Unicode

Total characters: 595726
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 197
2nd row: 323
3rd row: 5
4th row: 253
5th row: 112
Value                 Count     Frequency (%)
365                   9577      4.4%
366                   3449      1.6%
120                   1930      0.9%
304                   1835      0.8%
151                   1825      0.8%
273                   1697      0.8%
90                    1423      0.7%
121                   1386      0.6%
181                   1326      0.6%
59                    1319      0.6%
Other values (356)    190701    88.1%

Most occurring characters

ValueCountFrequency (%)
1 118478
19.9%
2 92661
15.6%
3 89296
15.0%
6 54005
9.1%
5 51255
8.6%
4 45056
 
7.6%
0 38299
 
6.4%
7 35879
 
6.0%
9 35493
 
6.0%
8 35304
 
5.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 595726
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 118478
19.9%
2 92661
15.6%
3 89296
15.0%
6 54005
9.1%
5 51255
8.6%
4 45056
 
7.6%
0 38299
 
6.4%
7 35879
 
6.0%
9 35493
 
6.0%
8 35304
 
5.9%

Most occurring scripts

ValueCountFrequency (%)
Common 595726
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 118478
19.9%
2 92661
15.6%
3 89296
15.0%
6 54005
9.1%
5 51255
8.6%
4 45056
 
7.6%
0 38299
 
6.4%
7 35879
 
6.0%
9 35493
 
6.0%
8 35304
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 595726
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 118478
19.9%
2 92661
15.6%
3 89296
15.0%
6 54005
9.1%
5 51255
8.6%
4 45056
 
7.6%
0 38299
 
6.4%
7 35879
 
6.0%
9 35493
 
6.0%
8 35304
 
5.9%
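
The eventDate sample rows and the startDayOfYear / endDayOfYear sample rows in this report line up (for example 1904-07-15 and 197). The sketch below shows one way such day-of-year values could be derived from an ISO 8601 date or a "start/end" interval. It assumes the day-of-year fields are computed directly from eventDate, which the report itself does not state; day_of_year_range is a hypothetical helper name, not part of the dataset or the profiling tool.

```python
from datetime import date


def day_of_year_range(event_date):
    """Return (start, end) day-of-year for 'YYYY-MM-DD' or 'YYYY-MM-DD/YYYY-MM-DD'."""
    # partition() leaves end_text empty when the value is a single date.
    start_text, _, end_text = event_date.partition("/")
    start = date.fromisoformat(start_text)
    end = date.fromisoformat(end_text) if end_text else start
    return start.timetuple().tm_yday, end.timetuple().tm_yday


if __name__ == "__main__":
    # Values copied from the eventDate sample rows and its most frequent interval.
    for value in ["1904-07-15", "1887-11-19", "2014-01-05", "1875-10-01/1875-10-31"]:
        print(value, day_of_year_range(value))
```

Under this assumption, whole-year intervals such as 1901-01-01/1901-12-31 would map to a start of 1 and an end of 365 or 366, which would be consistent with those values topping the two frequency tables.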

year
Text

Missing 

Distinct: 227
Distinct (%): 0.1%
Missing: 78830
Missing (%): 27.1%
Memory size: 2.2 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 848272
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): < 0.1%

Sample

1st row: 1904
2nd row: 1887
3rd row: 2014
4th row: 2008
5th row: 2006
Value                 Count     Frequency (%)
1909                  4335      2.0%
1910                  4083      1.9%
1913                  3479      1.6%
1912                  3322      1.6%
1920                  3154      1.5%
1908                  3022      1.4%
1907                  2956      1.4%
1911                  2923      1.4%
1968                  2883      1.4%
1919                  2827      1.3%
Other values (217)    179084    84.4%

Most occurring characters

ValueCountFrequency (%)
1 260246
30.7%
9 199374
23.5%
8 79743
 
9.4%
6 55292
 
6.5%
0 54406
 
6.4%
2 47998
 
5.7%
7 41819
 
4.9%
5 39221
 
4.6%
3 37532
 
4.4%
4 32641
 
3.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 848272
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 260246
30.7%
9 199374
23.5%
8 79743
 
9.4%
6 55292
 
6.5%
0 54406
 
6.4%
2 47998
 
5.7%
7 41819
 
4.9%
5 39221
 
4.6%
3 37532
 
4.4%
4 32641
 
3.8%

Most occurring scripts

ValueCountFrequency (%)
Common 848272
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 260246
30.7%
9 199374
23.5%
8 79743
 
9.4%
6 55292
 
6.5%
0 54406
 
6.4%
2 47998
 
5.7%
7 41819
 
4.9%
5 39221
 
4.6%
3 37532
 
4.4%
4 32641
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 848272
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 260246
30.7%
9 199374
23.5%
8 79743
 
9.4%
6 55292
 
6.5%
0 54406
 
6.4%
2 47998
 
5.7%
7 41819
 
4.9%
5 39221
 
4.6%
3 37532
 
4.4%
4 32641
 
3.8%

month
Text

Missing 

Distinct: 12
Distinct (%): < 0.1%
Missing: 87276
Missing (%): 30.0%
Memory size: 2.2 MiB

Length

Max length: 2
Median length: 1
Mean length: 1.22423412
Min length: 1

Characters and Unicode

Total characters: 249281
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 7
2nd row: 11
3rd row: 1
4th row: 9
5th row: 4
Value               Count    Frequency (%)
5                   29179    14.3%
4                   21061    10.3%
6                   20812    10.2%
10                  17835    8.8%
3                   16215    8.0%
11                  14949    7.3%
9                   14742    7.2%
1                   14411    7.1%
2                   14384    7.1%
7                   13920    6.8%
Other values (2)    26114    12.8%

Most occurring characters

ValueCountFrequency (%)
1 75019
30.1%
5 29179
 
11.7%
2 27259
 
10.9%
4 21061
 
8.4%
6 20812
 
8.3%
0 17835
 
7.2%
3 16215
 
6.5%
9 14742
 
5.9%
7 13920
 
5.6%
8 13239
 
5.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 249281
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 75019
30.1%
5 29179
 
11.7%
2 27259
 
10.9%
4 21061
 
8.4%
6 20812
 
8.3%
0 17835
 
7.2%
3 16215
 
6.5%
9 14742
 
5.9%
7 13920
 
5.6%
8 13239
 
5.3%

Most occurring scripts

ValueCountFrequency (%)
Common 249281
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 75019
30.1%
5 29179
 
11.7%
2 27259
 
10.9%
4 21061
 
8.4%
6 20812
 
8.3%
0 17835
 
7.2%
3 16215
 
6.5%
9 14742
 
5.9%
7 13920
 
5.6%
8 13239
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 249281
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 75019
30.1%
5 29179
 
11.7%
2 27259
 
10.9%
4 21061
 
8.4%
6 20812
 
8.3%
0 17835
 
7.2%
3 16215
 
6.5%
9 14742
 
5.9%
7 13920
 
5.6%
8 13239
 
5.3%

day
Text

Missing 

Distinct: 31
Distinct (%): < 0.1%
Missing: 101613
Missing (%): 34.9%
Memory size: 2.2 MiB

Length

Max length: 2
Median length: 2
Mean length: 1.704292469
Min length: 1

Characters and Unicode

Total characters: 322597
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 15
2nd row: 19
3rd row: 5
4th row: 9
5th row: 22
Value                Count     Frequency (%)
15                   7094      3.7%
1                    7019      3.7%
10                   6933      3.7%
20                   6929      3.7%
18                   6540      3.5%
5                    6461      3.4%
12                   6454      3.4%
16                   6383      3.4%
25                   6370      3.4%
2                    6356      3.4%
Other values (21)    122746    64.8%

Most occurring characters

ValueCountFrequency (%)
1 84961
26.3%
2 80139
24.8%
3 26438
 
8.2%
5 19925
 
6.2%
0 19527
 
6.1%
6 18700
 
5.8%
8 18677
 
5.8%
7 18412
 
5.7%
4 18316
 
5.7%
9 17502
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 322597
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 84961
26.3%
2 80139
24.8%
3 26438
 
8.2%
5 19925
 
6.2%
0 19527
 
6.1%
6 18700
 
5.8%
8 18677
 
5.8%
7 18412
 
5.7%
4 18316
 
5.7%
9 17502
 
5.4%

Most occurring scripts

ValueCountFrequency (%)
Common 322597
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 84961
26.3%
2 80139
24.8%
3 26438
 
8.2%
5 19925
 
6.2%
0 19527
 
6.1%
6 18700
 
5.8%
8 18677
 
5.8%
7 18412
 
5.7%
4 18316
 
5.7%
9 17502
 
5.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 322597
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 84961
26.3%
2 80139
24.8%
3 26438
 
8.2%
5 19925
 
6.2%
0 19527
 
6.1%
6 18700
 
5.8%
8 18677
 
5.8%
7 18412
 
5.7%
4 18316
 
5.7%
9 17502
 
5.4%

verbatimEventDate
Text

Missing 

Distinct: 75505
Distinct (%): 32.7%
Missing: 59902
Missing (%): 20.6%
Memory size: 2.2 MiB

Length

Max length: 255
Median length: 10
Mean length: 10.37595889
Min length: 1

Characters and Unicode

Total characters: 2396805
Distinct characters: 100
Distinct categories: 12
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 36235
Unique (%): 15.7%

Sample

1st row: 15/7/1904
2nd row: 19-11-1887
3rd row: before 1880
4th row: 5 januari 2014
5th row: 9 september 2008
Value                   Count     Frequency (%)
(blank)                 5954      2.0%
on                      4818      1.6%
label                   4338      1.5%
may                     2008      0.7%
april                   1649      0.6%
september               1517      0.5%
october                 1257      0.4%
june                    1251      0.4%
december                1227      0.4%
november                1156      0.4%
Other values (69619)    268785    91.4%

Most occurring characters

ValueCountFrequency (%)
1 471619
19.7%
- 339877
14.2%
9 255264
10.7%
0 218095
9.1%
2 170497
 
7.1%
8 129945
 
5.4%
6 103678
 
4.3%
5 96019
 
4.0%
3 94216
 
3.9%
/ 82976
 
3.5%
Other values (90) 434619
18.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1700058
70.9%
Dash Punctuation 339878
 
14.2%
Lowercase Letter 165606
 
6.9%
Other Punctuation 99666
 
4.2%
Space Separator 64354
 
2.7%
Uppercase Letter 26074
 
1.1%
Math Symbol 629
 
< 0.1%
Open Punctuation 269
 
< 0.1%
Close Punctuation 267
 
< 0.1%
Modifier Symbol 2
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 26055
15.7%
a 16564
10.0%
r 15695
9.5%
l 15620
9.4%
b 12889
 
7.8%
n 11626
 
7.0%
u 9103
 
5.5%
o 8700
 
5.3%
t 7075
 
4.3%
i 6959
 
4.2%
Other values (26) 35320
21.3%
Uppercase Letter
ValueCountFrequency (%)
M 4293
16.5%
O 4247
16.3%
J 3844
14.7%
A 2880
11.0%
N 1997
7.7%
D 1853
7.1%
S 1736
6.7%
I 1097
 
4.2%
F 1089
 
4.2%
H 788
 
3.0%
Other values (16) 2250
8.6%
Other Punctuation
ValueCountFrequency (%)
/ 82976
83.3%
, 5609
 
5.6%
: 5189
 
5.2%
. 3974
 
4.0%
' 733
 
0.7%
\ 686
 
0.7%
? 373
 
0.4%
" 49
 
< 0.1%
; 34
 
< 0.1%
! 24
 
< 0.1%
Other values (3) 19
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 471619
27.7%
9 255264
15.0%
0 218095
12.8%
2 170497
 
10.0%
8 129945
 
7.6%
6 103678
 
6.1%
5 96019
 
5.6%
3 94216
 
5.5%
7 81858
 
4.8%
4 78867
 
4.6%
Math Symbol
ValueCountFrequency (%)
± 585
93.0%
> 16
 
2.5%
< 14
 
2.2%
+ 10
 
1.6%
= 4
 
0.6%
Dash Punctuation
ValueCountFrequency (%)
- 339877
> 99.9%
1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 185
68.8%
[ 84
31.2%
Close Punctuation
ValueCountFrequency (%)
) 184
68.9%
] 83
31.1%
Space Separator
ValueCountFrequency (%)
64354
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 2
100.0%
Other Number
ValueCountFrequency (%)
½ 1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2205125
92.0%
Latin 191680
 
8.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 26055
13.6%
a 16564
 
8.6%
r 15695
 
8.2%
l 15620
 
8.1%
b 12889
 
6.7%
n 11626
 
6.1%
u 9103
 
4.7%
o 8700
 
4.5%
t 7075
 
3.7%
i 6959
 
3.6%
Other values (52) 61394
32.0%
Common
ValueCountFrequency (%)
1 471619
21.4%
- 339877
15.4%
9 255264
11.6%
0 218095
9.9%
2 170497
 
7.7%
8 129945
 
5.9%
6 103678
 
4.7%
5 96019
 
4.4%
3 94216
 
4.3%
/ 82976
 
3.8%
Other values (28) 242939
11.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2396058
> 99.9%
None 739
 
< 0.1%
Punctuation 8
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 471619
19.7%
- 339877
14.2%
9 255264
10.7%
0 218095
9.1%
2 170497
 
7.1%
8 129945
 
5.4%
6 103678
 
4.3%
5 96019
 
4.0%
3 94216
 
3.9%
/ 82976
 
3.5%
Other values (75) 433872
18.1%
None
ValueCountFrequency (%)
± 585
79.2%
ü 63
 
8.5%
é 35
 
4.7%
ä 28
 
3.8%
â 16
 
2.2%
ó 4
 
0.5%
´ 2
 
0.3%
ï 1
 
0.1%
ò 1
 
0.1%
½ 1
 
0.1%
Other values (3) 3
 
0.4%
Punctuation
ValueCountFrequency (%)
7
87.5%
1
 
12.5%

continent
Text

Missing 

Distinct7
Distinct (%)< 0.1%
Missing94391
Missing (%)32.4%
Memory size2.2 MiB

Length

Max length13
Median length6
Mean length6.659584646
Min length4
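The length figures can be reproduced from the raw column. The sketch below is an assumption about how such a summary could be computed, not the profiler's actual code; `length_summary` is a hypothetical helper. It also checks that the reported mean length for continent matches total characters divided by non-missing rows (290898 observations minus 94391 missing).

```python
from statistics import mean, median

def length_summary(values):
    """Summarise string lengths, ignoring missing (None) entries."""
    lengths = [len(v) for v in values if v is not None]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
        "total_characters": sum(lengths),
    }

# Toy sample in the style of the continent column (None marks a missing cell).
sample = ["EUROPE", "OCEANIA", None, "ASIA"]
summary = length_summary(sample)
print(summary)

# Consistency check against the figures reported for continent:
# total characters / (observations - missing) should equal the mean length.
reported_mean = 1308655 / (290898 - 94391)
print(round(reported_mean, 4))
```

The second print gives a value that agrees with the reported mean length of 6.659584646 to the printed precision, which suggests missing rows are excluded from the length statistics.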

Characters and Unicode

Total characters1308655
Distinct characters15
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
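As a rough illustration of the statement above, the following sketch tallies characters by Unicode general category using Python's standard `unicodedata` module. It is a minimal example under stated assumptions: the helper `category_counts` and the partial `CATEGORY_NAMES` mapping are hypothetical, and the report's script and block breakdowns are not covered because the standard library does not expose those properties.

```python
import unicodedata
from collections import Counter

# Long names for a few general-category codes (illustrative subset only);
# codes missing from this mapping are reported by their two-letter code.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Pc": "Connector Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
    "Sm": "Math Symbol",
}

def category_counts(values):
    """Count characters per Unicode general category, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Toy sample in the style of the continent column (None marks a missing cell).
sample = ["EUROPE", "OCEANIA", None, "SOUTH_AMERICA"]
counts = category_counts(sample)
print(counts.most_common())
```

On this sample the only categories that appear are Uppercase Letter and Connector Punctuation (the underscore), the same two categories the report lists for continent.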

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowEUROPE
2nd rowOCEANIA
3rd rowOCEANIA
4th rowOCEANIA
5th rowAFRICA
ValueCountFrequency (%)
europe 81421
41.4%
asia 55363
28.2%
south_america 25346
 
12.9%
africa 17201
 
8.8%
oceania 9475
 
4.8%
north_america 7546
 
3.8%
antarctica 155
 
0.1%

Most occurring characters

ValueCountFrequency (%)
A 230327
17.6%
E 205209
15.7%
R 139215
10.6%
O 123788
9.5%
I 115086
8.8%
U 106767
8.2%
P 81421
 
6.2%
S 80709
 
6.2%
C 59878
 
4.6%
T 33202
 
2.5%
Other values (5) 133053
10.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1275763
97.5%
Connector Punctuation 32892
 
2.5%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 230327
18.1%
E 205209
16.1%
R 139215
10.9%
O 123788
9.7%
I 115086
9.0%
U 106767
8.4%
P 81421
 
6.4%
S 80709
 
6.3%
C 59878
 
4.7%
T 33202
 
2.6%
Other values (4) 100161
7.9%
Connector Punctuation
ValueCountFrequency (%)
_ 32892
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1275763
97.5%
Common 32892
 
2.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 230327
18.1%
E 205209
16.1%
R 139215
10.9%
O 123788
9.7%
I 115086
9.0%
U 106767
8.4%
P 81421
 
6.4%
S 80709
 
6.3%
C 59878
 
4.7%
T 33202
 
2.6%
Other values (4) 100161
7.9%
Common
ValueCountFrequency (%)
_ 32892
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1308655
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 230327
17.6%
E 205209
15.7%
R 139215
10.6%
O 123788
9.5%
I 115086
8.8%
U 106767
8.2%
P 81421
 
6.2%
S 80709
 
6.2%
C 59878
 
4.6%
T 33202
 
2.5%
Other values (5) 133053
10.2%

island
Text

Missing 

Distinct1622
Distinct (%)1.8%
Missing200600
Missing (%)69.0%
Memory size2.2 MiB

Length

Max length49
Median length47
Mean length6.738764978
Min length3

Characters and Unicode

Total characters608497
Distinct characters85
Distinct categories9 ?
Distinct scripts2 ?
Distinct blocks2 ?

Unique

Unique702 ?
Unique (%)0.8%

Sample

1st rowSouth Island
2nd rowVlieland
3rd rowMoluccas
4th rowMoluccas
5th rowMoluccas
ValueCountFrequency (%)
java 34511
32.2%
sumatra 10786
 
10.1%
celebes 5435
 
5.1%
guinea 4561
 
4.3%
new 3784
 
3.5%
borneo 3686
 
3.4%
islands 3176
 
3.0%
texel 2904
 
2.7%
sunda 2297
 
2.1%
lesser 2296
 
2.1%
Other values (1286) 33756
31.5%

Most occurring characters

ValueCountFrequency (%)
a 134860
22.2%
e 54105
 
8.9%
v 35047
 
5.8%
J 34784
 
5.7%
r 30733
 
5.1%
n 29173
 
4.8%
u 26669
 
4.4%
s 25626
 
4.2%
l 23733
 
3.9%
o 22092
 
3.6%
Other values (75) 191675
31.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 481547
79.1%
Uppercase Letter 107161
 
17.6%
Space Separator 16894
 
2.8%
Other Punctuation 1792
 
0.3%
Open Punctuation 391
 
0.1%
Close Punctuation 391
 
0.1%
Dash Punctuation 318
 
0.1%
Decimal Number 2
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 134860
28.0%
e 54105
11.2%
v 35047
 
7.3%
r 30733
 
6.4%
n 29173
 
6.1%
u 26669
 
5.5%
s 25626
 
5.3%
l 23733
 
4.9%
o 22092
 
4.6%
i 18605
 
3.9%
Other values (34) 80904
16.8%
Uppercase Letter
ValueCountFrequency (%)
J 34784
32.5%
S 17177
16.0%
C 8035
 
7.5%
B 6776
 
6.3%
T 5863
 
5.5%
G 5613
 
5.2%
I 5488
 
5.1%
N 5301
 
4.9%
M 4501
 
4.2%
L 3719
 
3.5%
Other values (17) 9904
 
9.2%
Other Punctuation
ValueCountFrequency (%)
. 1136
63.4%
, 606
33.8%
? 28
 
1.6%
' 19
 
1.1%
/ 3
 
0.2%
Open Punctuation
ValueCountFrequency (%)
[ 349
89.3%
( 42
 
10.7%
Close Punctuation
ValueCountFrequency (%)
] 349
89.3%
) 42
 
10.7%
Decimal Number
ValueCountFrequency (%)
0 1
50.0%
1 1
50.0%
Space Separator
ValueCountFrequency (%)
16894
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 318
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 588708
96.7%
Common 19789
 
3.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 134860
22.9%
e 54105
 
9.2%
v 35047
 
6.0%
J 34784
 
5.9%
r 30733
 
5.2%
n 29173
 
5.0%
u 26669
 
4.5%
s 25626
 
4.4%
l 23733
 
4.0%
o 22092
 
3.8%
Other values (61) 171886
29.2%
Common
ValueCountFrequency (%)
16894
85.4%
. 1136
 
5.7%
, 606
 
3.1%
[ 349
 
1.8%
] 349
 
1.8%
- 318
 
1.6%
( 42
 
0.2%
) 42
 
0.2%
? 28
 
0.1%
' 19
 
0.1%
Other values (4) 6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 606505
99.7%
None 1992
 
0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 134860
22.2%
e 54105
 
8.9%
v 35047
 
5.8%
J 34784
 
5.7%
r 30733
 
5.1%
n 29173
 
4.8%
u 26669
 
4.4%
s 25626
 
4.2%
l 23733
 
3.9%
o 22092
 
3.6%
Other values (55) 189683
31.3%
None
ValueCountFrequency (%)
ç 1160
58.2%
ë 262
 
13.2%
é 198
 
9.9%
ø 169
 
8.5%
ö 100
 
5.0%
Ö 40
 
2.0%
á 11
 
0.6%
ü 11
 
0.6%
ã 9
 
0.5%
í 9
 
0.5%
Other values (10) 23
 
1.2%

countryCode
Text

Missing 

Distinct219
Distinct (%)0.1%
Missing47203
Missing (%)16.2%
Memory size2.2 MiB

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters487390
Distinct characters26
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique11 ?
Unique (%)< 0.1%

Sample

1st rowNL
2nd rowAU
3rd rowAU
4th rowAU
5th rowSN
ValueCountFrequency (%)
id 77470
31.8%
nl 69474
28.5%
sr 13923
 
5.7%
ke 3747
 
1.5%
br 3554
 
1.5%
us 3540
 
1.5%
zz 3536
 
1.5%
au 3349
 
1.4%
co 3153
 
1.3%
tw 2777
 
1.1%
Other values (209) 59172
24.3%

Most occurring characters

ValueCountFrequency (%)
I 82774
17.0%
D 81393
16.7%
N 76476
15.7%
L 74514
15.3%
R 24587
 
5.0%
S 23098
 
4.7%
Z 14189
 
2.9%
E 13028
 
2.7%
T 11985
 
2.5%
C 10224
 
2.1%
Other values (16) 75122
15.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 487390
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 82774
17.0%
D 81393
16.7%
N 76476
15.7%
L 74514
15.3%
R 24587
 
5.0%
S 23098
 
4.7%
Z 14189
 
2.9%
E 13028
 
2.7%
T 11985
 
2.5%
C 10224
 
2.1%
Other values (16) 75122
15.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 487390
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 82774
17.0%
D 81393
16.7%
N 76476
15.7%
L 74514
15.3%
R 24587
 
5.0%
S 23098
 
4.7%
Z 14189
 
2.9%
E 13028
 
2.7%
T 11985
 
2.5%
C 10224
 
2.1%
Other values (16) 75122
15.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 487390
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 82774
17.0%
D 81393
16.7%
N 76476
15.7%
L 74514
15.3%
R 24587
 
5.0%
S 23098
 
4.7%
Z 14189
 
2.9%
E 13028
 
2.7%
T 11985
 
2.5%
C 10224
 
2.1%
Other values (16) 75122
15.4%

stateProvince
Text

Missing 

Distinct7178
Distinct (%)4.7%
Missing137182
Missing (%)47.2%
Memory size2.2 MiB

Length

Max length80
Median length71
Mean length11.67551849
Min length1

Characters and Unicode

Total characters1794714
Distinct characters116
Distinct categories11 ?
Distinct scripts2 ?
Distinct blocks2 ?

Unique

Unique3142 ?
Unique (%)2.0%

Sample

1st rowSouth Holland
2nd rowNew South Wales
3rd rowSouth Australia
4th rowQueensland
5th rowFriesland
ValueCountFrequency (%)
holland 26804
 
10.7%
north 19049
 
7.6%
south 12974
 
5.2%
preanger 9164
 
3.7%
java 8838
 
3.5%
gelderland 6562
 
2.6%
friesland 4328
 
1.7%
guinea 4302
 
1.7%
overijssel 3400
 
1.4%
utrecht 3321
 
1.3%
Other values (5330) 151490
60.5%

Most occurring characters

ValueCountFrequency (%)
a 201596
 
11.2%
e 142421
 
7.9%
n 125543
 
7.0%
r 122104
 
6.8%
l 121317
 
6.8%
o 110985
 
6.2%
96516
 
5.4%
t 83085
 
4.6%
i 75782
 
4.2%
d 74750
 
4.2%
Other values (106) 640615
35.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1393309
77.6%
Uppercase Letter 254898
 
14.2%
Space Separator 96516
 
5.4%
Other Punctuation 36550
 
2.0%
Dash Punctuation 10986
 
0.6%
Close Punctuation 1011
 
0.1%
Open Punctuation 1010
 
0.1%
Decimal Number 229
 
< 0.1%
Math Symbol 201
 
< 0.1%
Other Symbol 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 201596
14.5%
e 142421
10.2%
n 125543
9.0%
r 122104
8.8%
l 121317
8.7%
o 110985
8.0%
t 83085
 
6.0%
i 75782
 
5.4%
d 74750
 
5.4%
s 60558
 
4.3%
Other values (42) 275168
19.7%
Uppercase Letter
ValueCountFrequency (%)
N 33640
13.2%
H 31015
12.2%
S 26906
 
10.6%
P 18598
 
7.3%
G 16799
 
6.6%
B 13084
 
5.1%
C 11351
 
4.5%
W 10922
 
4.3%
J 10679
 
4.2%
M 10545
 
4.1%
Other values (21) 71359
28.0%
Decimal Number
ValueCountFrequency (%)
0 92
40.2%
1 62
27.1%
5 13
 
5.7%
6 13
 
5.7%
4 13
 
5.7%
2 11
 
4.8%
9 10
 
4.4%
3 7
 
3.1%
7 4
 
1.7%
8 4
 
1.7%
Other Punctuation
ValueCountFrequency (%)
, 19171
52.5%
. 16675
45.6%
/ 201
 
0.5%
& 170
 
0.5%
' 157
 
0.4%
: 113
 
0.3%
? 52
 
0.1%
" 8
 
< 0.1%
; 3
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
> 83
41.3%
< 83
41.3%
± 27
 
13.4%
= 8
 
4.0%
Close Punctuation
ValueCountFrequency (%)
] 779
77.1%
) 231
 
22.8%
} 1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
[ 777
76.9%
( 232
 
23.0%
{ 1
 
0.1%
Space Separator
ValueCountFrequency (%)
96516
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10986
100.0%
Other Symbol
ValueCountFrequency (%)
° 2
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1648207
91.8%
Common 146507
 
8.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 201596
 
12.2%
e 142421
 
8.6%
n 125543
 
7.6%
r 122104
 
7.4%
l 121317
 
7.4%
o 110985
 
6.7%
t 83085
 
5.0%
i 75782
 
4.6%
d 74750
 
4.5%
s 60558
 
3.7%
Other values (73) 530066
32.2%
Common
ValueCountFrequency (%)
96516
65.9%
, 19171
 
13.1%
. 16675
 
11.4%
- 10986
 
7.5%
] 779
 
0.5%
[ 777
 
0.5%
( 232
 
0.2%
) 231
 
0.2%
/ 201
 
0.1%
& 170
 
0.1%
Other values (23) 769
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1788335
99.6%
None 6379
 
0.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 201596
 
11.3%
e 142421
 
8.0%
n 125543
 
7.0%
r 122104
 
6.8%
l 121317
 
6.8%
o 110985
 
6.2%
96516
 
5.4%
t 83085
 
4.6%
i 75782
 
4.2%
d 74750
 
4.2%
Other values (73) 634236
35.5%
None
ValueCountFrequency (%)
â 2265
35.5%
ë 2122
33.3%
ä 509
 
8.0%
é 410
 
6.4%
ü 208
 
3.3%
ô 192
 
3.0%
ö 128
 
2.0%
è 126
 
2.0%
á 90
 
1.4%
å 55
 
0.9%
Other values (23) 274
 
4.3%

locality
Text

Missing 

Distinct29704
Distinct (%)14.1%
Missing79647
Missing (%)27.4%
Memory size2.2 MiB

Length

Max length35460
Median length93
Mean length16.32246001
Min length2

Characters and Unicode

Total characters3448136
Distinct characters135
Distinct categories17 ?
Distinct scripts2 ?
Distinct blocks3 ?

Unique

Unique16268 ?
Unique (%)7.7%

Sample

1st rowLisse
2nd rowNew South Wales, no further locality
3rd rowKangaroo I.
4th rowsine loco [SW & SE Australia]
5th rowSenegal, no further locality
ValueCountFrequency (%)
locality 9277
 
1.9%
no 9263
 
1.9%
further 9250
 
1.9%
i 8571
 
1.8%
java 8148
 
1.7%
sine 6339
 
1.3%
loco 6337
 
1.3%
west 5995
 
1.2%
area 5203
 
1.1%
pangerango 4791
 
1.0%
Other values (25114) 414054
85.0%

Most occurring characters

ValueCountFrequency (%)
a 341144
 
9.9%
e 314678
 
9.1%
273926
 
7.9%
n 233745
 
6.8%
r 210253
 
6.1%
o 208503
 
6.0%
i 173894
 
5.0%
t 132266
 
3.8%
l 130259
 
3.8%
s 107689
 
3.1%
Other values (125) 1321779
38.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2573475
74.6%
Uppercase Letter 400106
 
11.6%
Space Separator 273926
 
7.9%
Other Punctuation 115906
 
3.4%
Decimal Number 23484
 
0.7%
Close Punctuation 18991
 
0.6%
Open Punctuation 18990
 
0.6%
Dash Punctuation 11570
 
0.3%
Control 7136
 
0.2%
Math Symbol 3131
 
0.1%
Other values (7) 1421
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 341144
13.3%
e 314678
12.2%
n 233745
 
9.1%
r 210253
 
8.2%
o 208503
 
8.1%
i 173894
 
6.8%
t 132266
 
5.1%
l 130259
 
5.1%
s 107689
 
4.2%
u 106314
 
4.1%
Other values (44) 614730
23.9%
Uppercase Letter
ValueCountFrequency (%)
S 39993
 
10.0%
B 32641
 
8.2%
M 28139
 
7.0%
P 27155
 
6.8%
W 26168
 
6.5%
N 20653
 
5.2%
K 19987
 
5.0%
T 18484
 
4.6%
H 18126
 
4.5%
A 17584
 
4.4%
Other values (25) 151176
37.8%
Other Punctuation
ValueCountFrequency (%)
, 71200
61.4%
. 22457
 
19.4%
' 8903
 
7.7%
/ 6607
 
5.7%
? 2997
 
2.6%
" 2017
 
1.7%
& 989
 
0.9%
: 541
 
0.5%
; 111
 
0.1%
! 70
 
0.1%
Other values (2) 14
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 6579
28.0%
1 3671
15.6%
2 2875
12.2%
5 2515
 
10.7%
3 1827
 
7.8%
4 1603
 
6.8%
6 1170
 
5.0%
8 1170
 
5.0%
9 1115
 
4.7%
7 959
 
4.1%
Math Symbol
ValueCountFrequency (%)
= 1027
32.8%
> 1022
32.6%
< 995
31.8%
± 50
 
1.6%
| 34
 
1.1%
+ 2
 
0.1%
~ 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 12701
66.9%
( 6288
33.1%
{ 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
] 12699
66.9%
) 6285
33.1%
} 7
 
< 0.1%
Control
ValueCountFrequency (%)
7104
99.6%
32
 
0.4%
Space Separator
ValueCountFrequency (%)
273926
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 11570
100.0%
Other Symbol
ValueCountFrequency (%)
° 615
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 376
100.0%
Final Punctuation
ValueCountFrequency (%)
312
100.0%
Initial Punctuation
ValueCountFrequency (%)
64
100.0%
Other Letter
ValueCountFrequency (%)
º 38
100.0%
Other Number
ValueCountFrequency (%)
½ 12
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2973619
86.2%
Common 474517
 
13.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 341144
 
11.5%
e 314678
 
10.6%
n 233745
 
7.9%
r 210253
 
7.1%
o 208503
 
7.0%
i 173894
 
5.8%
t 132266
 
4.4%
l 130259
 
4.4%
s 107689
 
3.6%
u 106314
 
3.6%
Other values (80) 1014874
34.1%
Common
ValueCountFrequency (%)
273926
57.7%
, 71200
 
15.0%
. 22457
 
4.7%
[ 12701
 
2.7%
] 12699
 
2.7%
- 11570
 
2.4%
' 8903
 
1.9%
7104
 
1.5%
/ 6607
 
1.4%
0 6579
 
1.4%
Other values (35) 40771
 
8.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3441460
99.8%
None 6297
 
0.2%
Punctuation 379
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 341144
 
9.9%
e 314678
 
9.1%
273926
 
8.0%
n 233745
 
6.8%
r 210253
 
6.1%
o 208503
 
6.1%
i 173894
 
5.1%
t 132266
 
3.8%
l 130259
 
3.8%
s 107689
 
3.1%
Other values (81) 1315103
38.2%
None
ValueCountFrequency (%)
é 1764
28.0%
ö 718
11.4%
° 615
 
9.8%
ä 574
 
9.1%
â 465
 
7.4%
ü 379
 
6.0%
ë 338
 
5.4%
è 186
 
3.0%
å 160
 
2.5%
Ö 130
 
2.1%
Other values (31) 968
15.4%
Punctuation
ValueCountFrequency (%)
312
82.3%
64
 
16.9%
3
 
0.8%

verbatimElevation
Text

Missing 

Distinct716
Distinct (%)27.7%
Missing288311
Missing (%)99.1%
Memory size2.2 MiB

Length

Max length30
Median length27
Mean length7.081175106
Min length2

Characters and Unicode

Total characters18319
Distinct characters57
Distinct categories9 ?
Distinct scripts2 ?
Distinct blocks2 ?

Unique

Unique421 ?
Unique (%)16.3%

Sample

1st row1700 m.
2nd row± 100 Meter
3rd row± 100 m
4th rowasc 3000 ft
5th row7000'
ValueCountFrequency (%)
m 1564
30.9%
meter 212
 
4.2%
ft 177
 
3.5%
± 168
 
3.3%
6000 137
 
2.7%
7000 121
 
2.4%
1000 106
 
2.1%
900 102
 
2.0%
1800 101
 
2.0%
3000 101
 
2.0%
Other values (358) 2280
45.0%

Most occurring characters

ValueCountFrequency (%)
0 5678
31.0%
2483
13.6%
m 1262
 
6.9%
1 1022
 
5.6%
. 814
 
4.4%
5 685
 
3.7%
M 616
 
3.4%
e 596
 
3.3%
' 548
 
3.0%
2 519
 
2.8%
Other values (47) 4096
22.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10006
54.6%
Lowercase Letter 3329
 
18.2%
Space Separator 2483
 
13.6%
Other Punctuation 1432
 
7.8%
Uppercase Letter 663
 
3.6%
Math Symbol 202
 
1.1%
Dash Punctuation 194
 
1.1%
Open Punctuation 5
 
< 0.1%
Close Punctuation 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
m 1262
37.9%
e 596
17.9%
t 508
15.3%
r 274
 
8.2%
f 214
 
6.4%
a 97
 
2.9%
o 81
 
2.4%
s 62
 
1.9%
z 33
 
1.0%
l 29
 
0.9%
Other values (14) 173
 
5.2%
Uppercase Letter
ValueCountFrequency (%)
M 616
92.9%
X 18
 
2.7%
F 9
 
1.4%
S 6
 
0.9%
E 3
 
0.5%
H 3
 
0.5%
K 2
 
0.3%
Y 2
 
0.3%
L 1
 
0.2%
V 1
 
0.2%
Other values (2) 2
 
0.3%
Decimal Number
ValueCountFrequency (%)
0 5678
56.7%
1 1022
 
10.2%
5 685
 
6.8%
2 519
 
5.2%
6 395
 
3.9%
7 387
 
3.9%
8 384
 
3.8%
4 355
 
3.5%
3 345
 
3.4%
9 236
 
2.4%
Other Punctuation
ValueCountFrequency (%)
. 814
56.8%
' 548
38.3%
, 66
 
4.6%
: 3
 
0.2%
/ 1
 
0.1%
Math Symbol
ValueCountFrequency (%)
± 196
97.0%
+ 6
 
3.0%
Space Separator
ValueCountFrequency (%)
2483
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 194
100.0%
Open Punctuation
ValueCountFrequency (%)
( 5
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 14327
78.2%
Latin 3992
 
21.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
m 1262
31.6%
M 616
15.4%
e 596
14.9%
t 508
12.7%
r 274
 
6.9%
f 214
 
5.4%
a 97
 
2.4%
o 81
 
2.0%
s 62
 
1.6%
z 33
 
0.8%
Other values (26) 249
 
6.2%
Common
ValueCountFrequency (%)
0 5678
39.6%
2483
17.3%
1 1022
 
7.1%
. 814
 
5.7%
5 685
 
4.8%
' 548
 
3.8%
2 519
 
3.6%
6 395
 
2.8%
7 387
 
2.7%
8 384
 
2.7%
Other values (11) 1412
 
9.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18121
98.9%
None 198
 
1.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 5678
31.3%
2483
13.7%
m 1262
 
7.0%
1 1022
 
5.6%
. 814
 
4.5%
5 685
 
3.8%
M 616
 
3.4%
e 596
 
3.3%
' 548
 
3.0%
2 519
 
2.9%
Other values (45) 3898
21.5%
None
ValueCountFrequency (%)
± 196
99.0%
ü 2
 
1.0%

decimalLatitude
Text

Missing 

Distinct8176
Distinct (%)5.4%
Missing139112
Missing (%)47.8%
Memory size2.2 MiB

Length

Max length10
Median length9
Mean length6.188172822
Min length3

Characters and Unicode

Total characters939278
Distinct characters12
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique2575 ?
Unique (%)1.7%

Sample

1st row52.25
2nd row-35.8417
3rd row13.5
4th row-45.15267
5th row-13.4
ValueCountFrequency (%)
6.7667 1821
 
1.2%
52.2417 1258
 
0.8%
6.775 1114
 
0.7%
6.5833 1111
 
0.7%
52.175 953
 
0.6%
5.9417 878
 
0.6%
3.5917 852
 
0.6%
52.1 846
 
0.6%
53.3917 843
 
0.6%
52.3583 813
 
0.5%
Other values (7241) 141297
93.1%

Most occurring characters

ValueCountFrequency (%)
. 151786
16.2%
5 137622
14.7%
3 107967
11.5%
1 87724
9.3%
2 84224
9.0%
7 76782
8.2%
8 60464
 
6.4%
6 56108
 
6.0%
0 51478
 
5.5%
4 48645
 
5.2%
Other values (2) 76478
8.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 742295
79.0%
Other Punctuation 151786
 
16.2%
Dash Punctuation 45197
 
4.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 137622
18.5%
3 107967
14.5%
1 87724
11.8%
2 84224
11.3%
7 76782
10.3%
8 60464
8.1%
6 56108
7.6%
0 51478
 
6.9%
4 48645
 
6.6%
9 31281
 
4.2%
Other Punctuation
ValueCountFrequency (%)
. 151786
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 45197
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 939278
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 151786
16.2%
5 137622
14.7%
3 107967
11.5%
1 87724
9.3%
2 84224
9.0%
7 76782
8.2%
8 60464
 
6.4%
6 56108
 
6.0%
0 51478
 
5.5%
4 48645
 
5.2%
Other values (2) 76478
8.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 939278
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 151786
16.2%
5 137622
14.7%
3 107967
11.5%
1 87724
9.3%
2 84224
9.0%
7 76782
8.2%
8 60464
 
6.4%
6 56108
 
6.0%
0 51478
 
5.5%
4 48645
 
5.2%
Other values (2) 76478
8.1%

decimalLongitude
Text

Missing 

Distinct9940
Distinct (%)6.5%
Missing139112
Missing (%)47.8%
Memory size2.2 MiB

Length

Max length11
Median length10
Mean length6.289394279
Min length3

Characters and Unicode

Total characters954642
Distinct characters12
Distinct categories3 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique3493 ?
Unique (%)2.3%

Sample

1st row4.5333
2nd row137.5083
3rd row-16.0
4th row169.89263
5th row48.27
ValueCountFrequency (%)
106.9167 1795
 
1.2%
107.0 1160
 
0.8%
106.925 1135
 
0.7%
106.8 1065
 
0.7%
4.875 997
 
0.7%
4.425 757
 
0.5%
124.8583 753
 
0.5%
98.675 723
 
0.5%
106.825 711
 
0.5%
6.1 699
 
0.5%
Other values (9112) 141991
93.5%

Most occurring characters

ValueCountFrequency (%)
. 151786
15.9%
1 122157
12.8%
5 102551
10.7%
3 90578
9.5%
7 85418
8.9%
4 73953
7.7%
0 73376
7.7%
8 73201
7.7%
6 63754
6.7%
2 51747
 
5.4%
Other values (2) 66121
6.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 781858
81.9%
Other Punctuation 151786
 
15.9%
Dash Punctuation 20998
 
2.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 122157
15.6%
5 102551
13.1%
3 90578
11.6%
7 85418
10.9%
4 73953
9.5%
0 73376
9.4%
8 73201
9.4%
6 63754
8.2%
2 51747
6.6%
9 45123
 
5.8%
Other Punctuation
ValueCountFrequency (%)
. 151786
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 20998
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 954642
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 151786
15.9%
1 122157
12.8%
5 102551
10.7%
3 90578
9.5%
7 85418
8.9%
4 73953
7.7%
0 73376
7.7%
8 73201
7.7%
6 63754
6.7%
2 51747
 
5.4%
Other values (2) 66121
6.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 954642
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 151786
15.9%
1 122157
12.8%
5 102551
10.7%
3 90578
9.5%
7 85418
8.9%
4 73953
7.7%
0 73376
7.7%
8 73201
7.7%
6 63754
6.7%
2 51747
 
5.4%
Other values (2) 66121
6.9%
Distinct173
Distinct (%)10.4%
Missing289239
Missing (%)99.4%
Memory size2.2 MiB

Length

Max length9
Median length7
Mean length5.42676311
Min length3

Characters and Unicode

Total characters9003
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique73 ?
Unique (%)4.4%

Sample

1st row640000.0
2nd row20000.0
3rd row640000.0
4th row1000.0
5th row1.0
ValueCountFrequency (%)
5.0 399
24.1%
82230.0 131
 
7.9%
60697.0 87
 
5.2%
100.0 71
 
4.3%
216478.0 65
 
3.9%
1000.0 48
 
2.9%
2000.0 47
 
2.8%
200.0 41
 
2.5%
5196.0 40
 
2.4%
50.0 37
 
2.2%
Other values (163) 693
41.8%

Most occurring characters

ValueCountFrequency (%)
0 3474
38.6%
. 1659
18.4%
5 693
 
7.7%
2 585
 
6.5%
6 556
 
6.2%
1 437
 
4.9%
7 386
 
4.3%
4 331
 
3.7%
8 315
 
3.5%
3 303
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7344
81.6%
Other Punctuation 1659
 
18.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3474
47.3%
5 693
 
9.4%
2 585
 
8.0%
6 556
 
7.6%
1 437
 
6.0%
7 386
 
5.3%
4 331
 
4.5%
8 315
 
4.3%
3 303
 
4.1%
9 264
 
3.6%
Other Punctuation
ValueCountFrequency (%)
. 1659
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 9003
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 3474
38.6%
. 1659
18.4%
5 693
 
7.7%
2 585
 
6.5%
6 556
 
6.2%
1 437
 
4.9%
7 386
 
4.3%
4 331
 
3.7%
8 315
 
3.5%
3 303
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9003
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3474
38.6%
. 1659
18.4%
5 693
 
7.7%
2 585
 
6.5%
6 556
 
6.2%
1 437
 
4.9%
7 386
 
4.3%
4 331
 
3.7%
8 315
 
3.5%
3 303
 
3.4%

typeStatus
Text

Missing 

Distinct6
Distinct (%)0.2%
Missing287427
Missing (%)98.8%
Memory size2.2 MiB

Length

Max length13
Median length7
Mean length7.703831749
Min length4

Characters and Unicode

Total characters26740
Distinct characters12
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowSYNTYPE
2nd rowSYNTYPE
3rd rowSYNTYPE
4th rowPARATYPE
5th rowPARATYPE
ValueCountFrequency (%)
syntype 2278
65.6%
paratype 500
 
14.4%
holotype 369
 
10.6%
paralectotype 239
 
6.9%
lectotype 79
 
2.3%
type 6
 
0.2%

Most occurring characters

ValueCountFrequency (%)
Y 5749
21.5%
P 4210
15.7%
T 3789
14.2%
E 3789
14.2%
S 2278
 
8.5%
N 2278
 
8.5%
A 1478
 
5.5%
O 1056
 
3.9%
R 739
 
2.8%
L 687
 
2.6%
Other values (2) 687
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 26740
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
Y 5749
21.5%
P 4210
15.7%
T 3789
14.2%
E 3789
14.2%
S 2278
 
8.5%
N 2278
 
8.5%
A 1478
 
5.5%
O 1056
 
3.9%
R 739
 
2.8%
L 687
 
2.6%
Other values (2) 687
 
2.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 26740
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
Y 5749
21.5%
P 4210
15.7%
T 3789
14.2%
E 3789
14.2%
S 2278
 
8.5%
N 2278
 
8.5%
A 1478
 
5.5%
O 1056
 
3.9%
R 739
 
2.8%
L 687
 
2.6%
Other values (2) 687
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26740
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
Y 5749
21.5%
P 4210
15.7%
T 3789
14.2%
E 3789
14.2%
S 2278
 
8.5%
N 2278
 
8.5%
A 1478
 
5.5%
O 1056
 
3.9%
R 739
 
2.8%
L 687
 
2.6%
Other values (2) 687
 
2.6%

identifiedBy
Text

Missing 

Distinct48
Distinct (%)11.7%
Missing290486
Missing (%)99.9%
Memory size2.2 MiB

Length

Max length25
Median length9
Mean length9.708737864
Min length4

Characters and Unicode

Total characters4000
Distinct characters58
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique21 ?
Unique (%)5.1%

Sample

1st rowRijswijk C. van
2nd rowKonter A.
3rd rowKonter A.
4th rowVoous of Wattel?
5th rowVoous
ValueCountFrequency (%)
konter 165
20.0%
a 165
20.0%
dekker 113
13.7%
r 113
13.7%
voous 32
 
3.9%
roselaar 21
 
2.5%
jansen 11
 
1.3%
j.f.j 11
 
1.3%
k 11
 
1.3%
of 9
 
1.1%
Other values (72) 173
21.0%

Most occurring characters

ValueCountFrequency (%)
e 527
13.2%
412
10.3%
. 408
10.2%
r 342
 
8.6%
o 283
 
7.1%
k 242
 
6.0%
n 218
 
5.5%
t 206
 
5.1%
K 184
 
4.6%
A 166
 
4.2%
Other values (48) 1012
25.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2283
57.1%
Uppercase Letter 833
 
20.8%
Other Punctuation 418
 
10.4%
Space Separator 412
 
10.3%
Decimal Number 48
 
1.2%
Open Punctuation 3
 
0.1%
Close Punctuation 3
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 527
23.1%
r 342
15.0%
o 283
12.4%
k 242
10.6%
n 218
9.5%
t 206
 
9.0%
a 108
 
4.7%
s 95
 
4.2%
l 60
 
2.6%
u 41
 
1.8%
Other values (13) 161
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
K 184
22.1%
A 166
19.9%
R 137
16.4%
D 121
14.5%
V 43
 
5.2%
J 36
 
4.3%
P 22
 
2.6%
S 19
 
2.3%
W 16
 
1.9%
C 15
 
1.8%
Other values (11) 74
8.9%
Decimal Number
ValueCountFrequency (%)
0 13
27.1%
1 11
22.9%
2 10
20.8%
3 8
16.7%
5 2
 
4.2%
9 2
 
4.2%
8 1
 
2.1%
4 1
 
2.1%
Other Punctuation
ValueCountFrequency (%)
. 408
97.6%
? 9
 
2.2%
& 1
 
0.2%
Space Separator
ValueCountFrequency (%)
412
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3116
77.9%
Common 884
 
22.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 527
16.9%
r 342
11.0%
o 283
9.1%
k 242
 
7.8%
n 218
 
7.0%
t 206
 
6.6%
K 184
 
5.9%
A 166
 
5.3%
R 137
 
4.4%
D 121
 
3.9%
Other values (34) 690
22.1%
Common
ValueCountFrequency (%)
412
46.6%
. 408
46.2%
0 13
 
1.5%
1 11
 
1.2%
2 10
 
1.1%
? 9
 
1.0%
3 8
 
0.9%
( 3
 
0.3%
) 3
 
0.3%
5 2
 
0.2%
Other values (4) 5
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 527
13.2%
412
10.3%
. 408
10.2%
r 342
 
8.6%
o 283
 
7.1%
k 242
 
6.0%
n 218
 
5.5%
t 206
 
5.1%
K 184
 
4.6%
A 166
 
4.2%
Other values (48) 1012
25.3%

dateIdentified
Text

Missing 

Distinct40
Distinct (%)15.6%
Missing290641
Missing (%)99.9%
Memory size2.2 MiB

Length

Max length19
Median length19
Mean length19
Min length19

Characters and Unicode

Total characters4883
Distinct characters13
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
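
As an illustration of the sentence above, a per-category character count can be approximated with Python's standard `unicodedata` module. This is a minimal sketch and not the profiler's own implementation; the function name `category_counts` is chosen here for illustration, and the two input strings are copied from this variable's sample rows.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            # unicodedata.category returns a two-letter code such as "Nd" or "Lu"
            counts[unicodedata.category(ch)] += 1
    return counts


# Two values copied from the dateIdentified sample rows
sample = ["2022-07-01T00:00:00", "2022-04-25T00:00:00"]
counts = category_counts(sample)

# Nd = Decimal Number, Pd = Dash Punctuation,
# Po = Other Punctuation, Lu = Uppercase Letter
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes correspond to the category names used in the tables of this report (for example `Nd` for Decimal Number).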

Unique

Unique 25
Unique (%) 9.7%

Sample

1st row 2022-07-01T00:00:00
2nd row 2022-04-25T00:00:00
3rd row 2022-04-25T00:00:00
4th row 1964-01-01T00:00:00
5th row 2022-04-25T00:00:00
ValueCountFrequency (%)
2022-04-25t00:00:00 165
64.2%
2018-05-31t00:00:00 13
 
5.1%
2021-07-01t00:00:00 11
 
4.3%
1964-01-01t00:00:00 10
 
3.9%
2014-10-28t00:00:00 7
 
2.7%
2014-10-20t00:00:00 4
 
1.6%
2023-12-28t00:00:00 3
 
1.2%
2022-08-31t00:00:00 3
 
1.2%
2017-04-17t00:00:00 3
 
1.2%
2023-01-01t00:00:00 3
 
1.2%
Other values (30) 35
 
13.6%

Most occurring characters

ValueCountFrequency (%)
0 2092
42.8%
2 814
 
16.7%
- 514
 
10.5%
: 514
 
10.5%
T 257
 
5.3%
4 195
 
4.0%
5 184
 
3.8%
1 175
 
3.6%
8 38
 
0.8%
3 33
 
0.7%
Other values (3) 67
 
1.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3598
73.7%
Dash Punctuation 514
 
10.5%
Other Punctuation 514
 
10.5%
Uppercase Letter 257
 
5.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2092
58.1%
2 814
 
22.6%
4 195
 
5.4%
5 184
 
5.1%
1 175
 
4.9%
8 38
 
1.1%
3 33
 
0.9%
7 28
 
0.8%
9 23
 
0.6%
6 16
 
0.4%
Dash Punctuation
ValueCountFrequency (%)
- 514
100.0%
Other Punctuation
ValueCountFrequency (%)
: 514
100.0%
Uppercase Letter
ValueCountFrequency (%)
T 257
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4626
94.7%
Latin 257
 
5.3%

Most frequent character per script

Common
ValueCountFrequency (%)
0 2092
45.2%
2 814
 
17.6%
- 514
 
11.1%
: 514
 
11.1%
4 195
 
4.2%
5 184
 
4.0%
1 175
 
3.8%
8 38
 
0.8%
3 33
 
0.7%
7 28
 
0.6%
Other values (2) 39
 
0.8%
Latin
ValueCountFrequency (%)
T 257
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4883
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 2092
42.8%
2 814
 
16.7%
- 514
 
10.5%
: 514
 
10.5%
T 257
 
5.3%
4 195
 
4.0%
5 184
 
3.8%
1 175
 
3.6%
8 38
 
0.8%
3 33
 
0.7%
Other values (3) 67
 
1.4%
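
The dateIdentified figures above are consistent with a fixed 19-character `YYYY-MM-DDTHH:MM:SS` layout: 290898 - 290641 = 257 non-missing values, and 257 x 19 = 4883 characters, with two hyphens, two colons and one `T` per value. A hedged sketch of parsing such values with the standard library follows; the helper name `parse_date_identified` and the choice to treat `None` and empty strings as missing are assumptions made for this example.

```python
from datetime import datetime


def parse_date_identified(value):
    """Parse a 'YYYY-MM-DDTHH:MM:SS' string; return None for missing values."""
    if value is None or value == "":
        return None
    return datetime.fromisoformat(value)


# Three values from the sample rows plus one missing entry
rows = ["2022-07-01T00:00:00", "2022-04-25T00:00:00", None, "1964-01-01T00:00:00"]
parsed = [parse_date_identified(v) for v in rows]

# Keep only the rows that could be parsed
years = [d.year for d in parsed if d is not None]
print(years)
```

Because every time component in the listed values is 00:00:00, converting to a plain date after parsing would lose no information for the values shown here.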
Distinct 14746
Distinct (%) 5.1%
Missing 1
Missing (%) < 0.1%
Memory size 2.2 MiB

Length

Max length 8
Median length 7
Mean length 6.988418581
Min length 1

Characters and Unicode

Total characters 2032910
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3066
Unique (%) 1.1%

Sample

1st row 2484620
2nd row 7342142
3rd row 2479504
4th row 6170652
5th row 6170887
ValueCountFrequency (%)
5231191 1635
 
0.6%
6172874 1489
 
0.5%
6065824 1204
 
0.4%
6171845 1145
 
0.4%
2480242 1135
 
0.4%
7191198 1017
 
0.3%
7341902 981
 
0.3%
9156140 924
 
0.3%
7192432 869
 
0.3%
8990910 856
 
0.3%
Other values (14736) 279642
96.1%

Most occurring characters

ValueCountFrequency (%)
2 262848
12.9%
4 256924
12.6%
1 235914
11.6%
7 227945
11.2%
9 218745
10.8%
8 186874
9.2%
6 180095
8.9%
0 173457
8.5%
5 148365
7.3%
3 141743
7.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2032910
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 262848
12.9%
4 256924
12.6%
1 235914
11.6%
7 227945
11.2%
9 218745
10.8%
8 186874
9.2%
6 180095
8.9%
0 173457
8.5%
5 148365
7.3%
3 141743
7.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2032910
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 262848
12.9%
4 256924
12.6%
1 235914
11.6%
7 227945
11.2%
9 218745
10.8%
8 186874
9.2%
6 180095
8.9%
0 173457
8.5%
5 148365
7.3%
3 141743
7.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2032910
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 262848
12.9%
4 256924
12.6%
1 235914
11.6%
7 227945
11.2%
9 218745
10.8%
8 186874
9.2%
6 180095
8.9%
0 173457
8.5%
5 148365
7.3%
3 141743
7.0%
Distinct 15605
Distinct (%) 5.4%
Missing 0
Missing (%) 0.0%
Memory size 2.2 MiB

Length

Max length 73
Median length 62
Mean length 33.46860068
Min length 4

Characters and Unicode

Total characters 9735949
Distinct characters 82
Distinct categories 9
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3323
Unique (%) 1.1%

Sample

1st row Vidua orientalis Heuglin, 1870
2nd row Turdus viscivorus viscivorus
3rd row Neophema splendida (Gould, 1841)
4th row Platycercus elegans melanopterus North, 1906
5th row Polytelis anthopeplus monarchoides Schodde, 1993
ValueCountFrequency (%)
linnaeus 44499
 
3.9%
1758 35006
 
3.1%
temminck 8993
 
0.8%
vieillot 8822
 
0.8%
1766 8577
 
0.8%
8565
 
0.8%
1789 7166
 
0.6%
1821 6706
 
0.6%
horsfield 6369
 
0.6%
gmelin 5943
 
0.5%
Other values (10019) 994615
87.6%

Most occurring characters

ValueCountFrequency (%)
844363
 
8.7%
a 843067
 
8.7%
i 697116
 
7.2%
s 650497
 
6.7%
e 563060
 
5.8%
r 541457
 
5.6%
u 521696
 
5.4%
n 513245
 
5.3%
l 460926
 
4.7%
o 452646
 
4.6%
Other values (72) 3647876
37.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7140818
73.3%
Space Separator 844363
 
8.7%
Decimal Number 755884
 
7.8%
Uppercase Letter 534717
 
5.5%
Other Punctuation 240381
 
2.5%
Close Punctuation 109373
 
1.1%
Open Punctuation 109373
 
1.1%
Dash Punctuation 1039
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 843067
11.8%
i 697116
9.8%
s 650497
9.1%
e 563060
 
7.9%
r 541457
 
7.6%
u 521696
 
7.3%
n 513245
 
7.2%
l 460926
 
6.5%
o 452646
 
6.3%
c 359670
 
5.0%
Other values (25) 1537438
21.5%
Uppercase Letter
ValueCountFrequency (%)
L 78527
14.7%
P 55836
10.4%
C 49872
9.3%
S 46252
 
8.6%
A 38918
 
7.3%
G 32535
 
6.1%
T 32243
 
6.0%
M 27609
 
5.2%
B 26545
 
5.0%
H 26230
 
4.9%
Other values (17) 120150
22.5%
Decimal Number
ValueCountFrequency (%)
1 228842
30.3%
8 165376
21.9%
7 99308
13.1%
5 54960
 
7.3%
9 44245
 
5.9%
6 43057
 
5.7%
2 36834
 
4.9%
3 33830
 
4.5%
0 26848
 
3.6%
4 22584
 
3.0%
Other Punctuation
ValueCountFrequency (%)
, 188988
78.6%
. 42650
 
17.7%
& 8562
 
3.6%
' 178
 
0.1%
? 3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
844363
100.0%
Close Punctuation
ValueCountFrequency (%)
) 109373
100.0%
Open Punctuation
ValueCountFrequency (%)
( 109373
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1039
100.0%
Math Symbol
ValueCountFrequency (%)
× 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7675535
78.8%
Common 2060414
 
21.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 843067
11.0%
i 697116
 
9.1%
s 650497
 
8.5%
e 563060
 
7.3%
r 541457
 
7.1%
u 521696
 
6.8%
n 513245
 
6.7%
l 460926
 
6.0%
o 452646
 
5.9%
c 359670
 
4.7%
Other values (52) 2072155
27.0%
Common
ValueCountFrequency (%)
844363
41.0%
1 228842
 
11.1%
, 188988
 
9.2%
8 165376
 
8.0%
) 109373
 
5.3%
( 109373
 
5.3%
7 99308
 
4.8%
5 54960
 
2.7%
9 44245
 
2.1%
6 43057
 
2.1%
Other values (10) 172529
 
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9732344
> 99.9%
None 3605
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
844363
 
8.7%
a 843067
 
8.7%
i 697116
 
7.2%
s 650497
 
6.7%
e 563060
 
5.8%
r 541457
 
5.6%
u 521696
 
5.4%
n 513245
 
5.3%
l 460926
 
4.7%
o 452646
 
4.7%
Other values (61) 3644271
37.4%
None
ValueCountFrequency (%)
ü 2204
61.1%
ø 470
 
13.0%
é 383
 
10.6%
ä 257
 
7.1%
á 181
 
5.0%
è 40
 
1.1%
É 38
 
1.1%
ö 14
 
0.4%
ë 12
 
0.3%
ñ 5
 
0.1%
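
The block tables above split the 9735949 characters of this variable into 9732344 ASCII characters and 3605 others, which the report lists under "None" (ü, ø, é and similar accented letters). A minimal sketch of isolating such characters is given below, assuming that "non-ASCII" simply means a code point above 127; the function name `non_ascii_counts` and the example strings are hypothetical and not taken from the dataset.

```python
import unicodedata
from collections import Counter


def non_ascii_counts(values):
    """Count every character whose code point lies outside the ASCII range."""
    counts = Counter()
    for value in values:
        for ch in value:
            if ord(ch) > 127:
                counts[ch] += 1
    return counts


# Hypothetical strings; \u00fc is "ü" and \u00f6 is "ö"
examples = ["R\u00fcppell", "Br\u00fcnnich", "L\u00f6nnberg"]
found = non_ascii_counts(examples)

for ch, n in found.most_common():
    # unicodedata.name gives the official character name
    print(ch, n, unicodedata.name(ch))
```

Listing the official character names this way can help decide whether such letters are legitimate parts of author names or signs of an encoding problem.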
Distinct 310
Distinct (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory size 2.2 MiB

Length

Max length 45
Median length 43
Mean length 16.58477886
Min length 8

Characters and Unicode

Total characters 4824479
Distinct characters 52
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 22
Unique (%) < 0.1%

Sample

1st row Animalia|Viduidae
2nd row Animalia|Turdidae
3rd row Animalia|Psittacidae
4th row Animalia|Psittacidae
5th row Animalia|Psittacidae
ValueCountFrequency (%)
animalia 74175
25.4%
animalia|turdidae 13154
 
4.5%
animalia|scolopacidae 11012
 
3.8%
animalia|sylviidae 10286
 
3.5%
animalia|emberizidae 8024
 
2.8%
animalia|fringillidae 7443
 
2.6%
animalia|corvidae 7140
 
2.4%
animalia|ardeidae 5218
 
1.8%
animalia|timaliidae 5010
 
1.7%
animalia|charadriidae 4907
 
1.7%
Other values (298) 145238
49.8%

Most occurring characters

ValueCountFrequency (%)
i 936776
19.4%
a 917961
19.0%
l 393768
8.2%
n 357847
 
7.4%
m 319003
 
6.6%
A 316570
 
6.6%
e 276152
 
5.7%
d 261177
 
5.4%
| 221132
 
4.6%
r 138033
 
2.9%
Other values (42) 686060
14.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4088471
84.7%
Uppercase Letter 513434
 
10.6%
Math Symbol 221132
 
4.6%
Other Punctuation 733
 
< 0.1%
Space Separator 709
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 936776
22.9%
a 917961
22.5%
l 393768
9.6%
n 357847
 
8.8%
m 319003
 
7.8%
e 276152
 
6.8%
d 261177
 
6.4%
r 138033
 
3.4%
c 98653
 
2.4%
o 93715
 
2.3%
Other values (13) 295386
 
7.2%
Uppercase Letter
ValueCountFrequency (%)
A 316570
61.7%
P 35114
 
6.8%
T 32092
 
6.3%
S 31063
 
6.1%
C 24935
 
4.9%
M 14619
 
2.8%
E 13061
 
2.5%
F 11340
 
2.2%
L 6846
 
1.3%
N 4455
 
0.9%
Other values (12) 23339
 
4.5%
Other Punctuation
ValueCountFrequency (%)
: 679
92.6%
? 39
 
5.3%
/ 12
 
1.6%
, 2
 
0.3%
. 1
 
0.1%
Math Symbol
ValueCountFrequency (%)
| 221132
100.0%
Space Separator
ValueCountFrequency (%)
709
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4601905
95.4%
Common 222574
 
4.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 936776
20.4%
a 917961
19.9%
l 393768
8.6%
n 357847
 
7.8%
m 319003
 
6.9%
A 316570
 
6.9%
e 276152
 
6.0%
d 261177
 
5.7%
r 138033
 
3.0%
c 98653
 
2.1%
Other values (35) 585965
12.7%
Common
ValueCountFrequency (%)
| 221132
99.4%
709
 
0.3%
: 679
 
0.3%
? 39
 
< 0.1%
/ 12
 
< 0.1%
, 2
 
< 0.1%
. 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4824479
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 936776
19.4%
a 917961
19.0%
l 393768
8.2%
n 357847
 
7.4%
m 319003
 
6.6%
A 316570
 
6.6%
e 276152
 
5.7%
d 261177
 
5.4%
| 221132
 
4.6%
r 138033
 
2.9%
Other values (42) 686060
14.2%
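
Values in the variable summarised above, such as `Animalia|Viduidae`, appear to pack several taxonomic names into one pipe-delimited string. The following sketch splits such a value into its parts; it assumes that `|` is always the separator and that surrounding whitespace is not significant, and the function name `split_classification` is hypothetical.

```python
def split_classification(value):
    """Split a pipe-delimited classification string into its non-empty parts."""
    return [part.strip() for part in value.split("|") if part.strip()]


# Values copied from the sample rows and the frequency table above
print(split_classification("Animalia|Viduidae"))
print(split_classification("Animalia"))
```

A value without any `|`, such as the bare `Animalia` entries in the frequency table, yields a single-element list.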
Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 2.2 MiB

Length

Max length 14
Median length 8
Mean length 8.000020626
Min length 8

Characters and Unicode

Total characters 2327190
Distinct characters 13
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row Animalia
2nd row Animalia
3rd row Animalia
4th row Animalia
5th row Animalia
ValueCountFrequency (%)
animalia 290897
> 99.9%
incertae 1
 
< 0.1%
sedis 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
i 581796
25.0%
a 581795
25.0%
n 290898
12.5%
A 290897
12.5%
m 290897
12.5%
l 290897
12.5%
e 3
 
< 0.1%
s 2
 
< 0.1%
c 1
 
< 0.1%
r 1
 
< 0.1%
Other values (3) 3
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2036292
87.5%
Uppercase Letter 290897
 
12.5%
Space Separator 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 581796
28.6%
a 581795
28.6%
n 290898
14.3%
m 290897
14.3%
l 290897
14.3%
e 3
 
< 0.1%
s 2
 
< 0.1%
c 1
 
< 0.1%
r 1
 
< 0.1%
t 1
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
A 290897
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2327189
> 99.9%
Common 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 581796
25.0%
a 581795
25.0%
n 290898
12.5%
A 290897
12.5%
m 290897
12.5%
l 290897
12.5%
e 3
 
< 0.1%
s 2
 
< 0.1%
c 1
 
< 0.1%
r 1
 
< 0.1%
Other values (2) 2
 
< 0.1%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2327190
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 581796
25.0%
a 581795
25.0%
n 290898
12.5%
A 290897
12.5%
m 290897
12.5%
l 290897
12.5%
e 3
 
< 0.1%
s 2
 
< 0.1%
c 1
 
< 0.1%
r 1
 
< 0.1%
Other values (3) 3
 
< 0.1%

phylum
Text

Distinct 3
Distinct (%) < 0.1%
Missing 766
Missing (%) 0.3%
Memory size 2.2 MiB

Length

Max length 10
Median length 8
Mean length 8.000461859
Min length 8

Characters and Unicode

Total characters 2321190
Distinct characters 14
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Chordata
2nd row Chordata
3rd row Chordata
4th row Chordata
5th row Chordata
ValueCountFrequency (%)
chordata 290063
> 99.9%
arthropoda 67
 
< 0.1%
mollusca 2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
a 580195
25.0%
o 290199
12.5%
r 290197
12.5%
h 290130
12.5%
d 290130
12.5%
t 290130
12.5%
C 290063
12.5%
A 67
 
< 0.1%
p 67
 
< 0.1%
l 4
 
< 0.1%
Other values (4) 8
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2031058
87.5%
Uppercase Letter 290132
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 580195
28.6%
o 290199
14.3%
r 290197
14.3%
h 290130
14.3%
d 290130
14.3%
t 290130
14.3%
p 67
 
< 0.1%
l 4
 
< 0.1%
u 2
 
< 0.1%
s 2
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
C 290063
> 99.9%
A 67
 
< 0.1%
M 2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 2321190
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 580195
25.0%
o 290199
12.5%
r 290197
12.5%
h 290130
12.5%
d 290130
12.5%
t 290130
12.5%
C 290063
12.5%
A 67
 
< 0.1%
p 67
 
< 0.1%
l 4
 
< 0.1%
Other values (4) 8
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2321190
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 580195
25.0%
o 290199
12.5%
r 290197
12.5%
h 290130
12.5%
d 290130
12.5%
t 290130
12.5%
C 290063
12.5%
A 67
 
< 0.1%
p 67
 
< 0.1%
l 4
 
< 0.1%
Other values (4) 8
 
< 0.1%
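
The phylum summary above reports three distinct values, 766 missing values and no unique values. The sketch below shows one way such figures could be recomputed. It assumes that "Unique" counts values occurring exactly once and that the distinct percentage is taken over non-missing values, which matches the dateIdentified figures earlier in this report (40 / 257 = 15.6%) but is not confirmed here as the profiler's definition. The function name `summarize` and the toy data are hypothetical.

```python
from collections import Counter


def summarize(values):
    """Return distinct, unique and missing counts for a list with None gaps."""
    present = [v for v in values if v is not None]
    freq = Counter(present)
    n = len(present)
    return {
        "distinct": len(freq),                               # different values
        "unique": sum(1 for c in freq.values() if c == 1),   # seen exactly once
        "missing": len(values) - n,
        "distinct_pct": 100 * len(freq) / n if n else 0.0,
    }


# Toy column shaped like phylum: three values, none occurring only once
toy = ["Chordata"] * 5 + ["Arthropoda"] * 2 + ["Mollusca"] * 2 + [None]
summary = summarize(toy)
print(summary)
```

In the toy column every value occurs at least twice, so the unique count is zero even though three distinct values are present, mirroring the pattern reported for phylum.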

class
Text

Distinct 6
Distinct (%) < 0.1%
Missing 770
Missing (%) 0.3%
Memory size 2.2 MiB

Length

Max length 12
Median length 4
Mean length 4.000820328
Min length 4

Characters and Unicode

Total characters 1160750
Distinct characters 19
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) < 0.1%

Sample

1st row Aves
2nd row Aves
3rd row Aves
4th row Aves
5th row Aves
ValueCountFrequency (%)
aves 290053
> 99.9%
insecta 66
 
< 0.1%
mammalia 6
 
< 0.1%
bivalvia 1
 
< 0.1%
squamata 1
 
< 0.1%
malacostraca 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
s 290120
25.0%
e 290119
25.0%
v 290055
25.0%
A 290053
25.0%
a 93
 
< 0.1%
c 68
 
< 0.1%
t 68
 
< 0.1%
I 66
 
< 0.1%
n 66
 
< 0.1%
m 13
 
< 0.1%
Other values (9) 29
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 870622
75.0%
Uppercase Letter 290128
 
25.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 290120
33.3%
e 290119
33.3%
v 290055
33.3%
a 93
 
< 0.1%
c 68
 
< 0.1%
t 68
 
< 0.1%
n 66
 
< 0.1%
m 13
 
< 0.1%
i 8
 
< 0.1%
l 8
 
< 0.1%
Other values (4) 4
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
A 290053
> 99.9%
I 66
 
< 0.1%
M 7
 
< 0.1%
B 1
 
< 0.1%
S 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 1160750
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 290120
25.0%
e 290119
25.0%
v 290055
25.0%
A 290053
25.0%
a 93
 
< 0.1%
c 68
 
< 0.1%
t 68
 
< 0.1%
I 66
 
< 0.1%
n 66
 
< 0.1%
m 13
 
< 0.1%
Other values (9) 29
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1160750
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 290120
25.0%
e 290119
25.0%
v 290055
25.0%
A 290053
25.0%
a 93
 
< 0.1%
c 68
 
< 0.1%
t 68
 
< 0.1%
I 66
 
< 0.1%
n 66
 
< 0.1%
m 13
 
< 0.1%
Other values (9) 29
 
< 0.1%

order
Text

Distinct 53
Distinct (%) < 0.1%
Missing 1492
Missing (%) 0.5%
Memory size 2.2 MiB

Length

Max length 19
Median length 13
Mean length 13.08270388
Min length 7

Characters and Unicode

Total characters 3786213
Distinct characters 36
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 5
Unique (%) < 0.1%

Sample

1st row Passeriformes
2nd row Passeriformes
3rd row Psittaciformes
4th row Psittaciformes
5th row Psittaciformes
ValueCountFrequency (%)
passeriformes 145670
50.3%
charadriiformes 33385
 
11.5%
accipitriformes 10340
 
3.6%
anseriformes 10163
 
3.5%
columbiformes 9902
 
3.4%
piciformes 8355
 
2.9%
galliformes 7462
 
2.6%
apodiformes 7409
 
2.6%
pelecaniformes 7141
 
2.5%
coraciiformes 7019
 
2.4%
Other values (43) 42560
 
14.7%

Most occurring characters

ValueCountFrequency (%)
s 598367
15.8%
r 550123
14.5%
e 466963
12.3%
i 383264
10.1%
o 327105
8.6%
m 301232
8.0%
f 289334
7.6%
a 250700
6.6%
P 173212
 
4.6%
c 67656
 
1.8%
Other values (26) 378257
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3496807
92.4%
Uppercase Letter 289406
 
7.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 598367
17.1%
r 550123
15.7%
e 466963
13.4%
i 383264
11.0%
o 327105
9.4%
m 301232
8.6%
f 289334
8.3%
a 250700
7.2%
c 67656
 
1.9%
l 50713
 
1.5%
Other values (10) 211350
 
6.0%
Uppercase Letter
ValueCountFrequency (%)
P 173212
59.9%
C 58998
 
20.4%
A 27958
 
9.7%
G 14500
 
5.0%
S 7509
 
2.6%
F 4075
 
1.4%
B 1360
 
0.5%
T 1048
 
0.4%
O 266
 
0.1%
M 226
 
0.1%
Other values (6) 254
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 3786213
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 598367
15.8%
r 550123
14.5%
e 466963
12.3%
i 383264
10.1%
o 327105
8.6%
m 301232
8.0%
f 289334
7.6%
a 250700
6.6%
P 173212
 
4.6%
c 67656
 
1.8%
Other values (26) 378257
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3786213
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 598367
15.8%
r 550123
14.5%
e 466963
12.3%
i 383264
10.1%
o 327105
8.6%
m 301232
8.0%
f 289334
7.6%
a 250700
6.6%
P 173212
 
4.6%
c 67656
 
1.8%
Other values (26) 378257
10.0%

family
Text

Distinct 250
Distinct (%) 0.1%
Missing 1542
Missing (%) 0.5%
Memory size 2.2 MiB

Length

Max length 18
Median length 16
Mean length 10.45621311
Min length 7

Characters and Unicode

Total characters 3025568
Distinct characters 42
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 13
Unique (%) < 0.1%

Sample

1st row Viduidae
2nd row Turdidae
3rd row Psittacidae
4th row Psittacidae
5th row Psittacidae
ValueCountFrequency (%)
scolopacidae 11999
 
4.1%
muscicapidae 11304
 
3.9%
anatidae 10104
 
3.5%
accipitridae 9920
 
3.4%
columbidae 9902
 
3.4%
laridae 9646
 
3.3%
fringillidae 8978
 
3.1%
corvidae 7653
 
2.6%
turdidae 6987
 
2.4%
psittacidae 6737
 
2.3%
Other values (240) 196126
67.8%

Most occurring characters

ValueCountFrequency (%)
i 466305
15.4%
a 457770
15.1%
e 358881
11.9%
d 337256
11.1%
r 160715
 
5.3%
c 157588
 
5.2%
l 139993
 
4.6%
o 124575
 
4.1%
t 93283
 
3.1%
n 90718
 
3.0%
Other values (32) 638484
21.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2736212
90.4%
Uppercase Letter 289356
 
9.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 466305
17.0%
a 457770
16.7%
e 358881
13.1%
d 337256
12.3%
r 160715
 
5.9%
c 157588
 
5.8%
l 139993
 
5.1%
o 124575
 
4.6%
t 93283
 
3.4%
n 90718
 
3.3%
Other values (11) 349128
12.8%
Uppercase Letter
ValueCountFrequency (%)
P 52163
18.0%
A 42928
14.8%
C 41817
14.5%
T 29194
10.1%
S 26989
9.3%
M 24976
8.6%
L 14587
 
5.0%
F 14496
 
5.0%
R 9279
 
3.2%
E 7381
 
2.6%
Other values (11) 25546
8.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 3025568
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 466305
15.4%
a 457770
15.1%
e 358881
11.9%
d 337256
11.1%
r 160715
 
5.3%
c 157588
 
5.2%
l 139993
 
4.6%
o 124575
 
4.1%
t 93283
 
3.1%
n 90718
 
3.0%
Other values (32) 638484
21.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3025568
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 466305
15.4%
a 457770
15.1%
e 358881
11.9%
d 337256
11.1%
r 160715
 
5.3%
c 157588
 
5.2%
l 139993
 
4.6%
o 124575
 
4.1%
t 93283
 
3.1%
n 90718
 
3.0%
Other values (32) 638484
21.1%

genus
Text

Distinct 2192
Distinct (%) 0.8%
Missing 1404
Missing (%) 0.5%
Memory size 2.2 MiB

Length

Max length 18
Median length 15
Mean length 8.287798711
Min length 3

Characters and Unicode

Total characters 2399268
Distinct characters 52
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 157
Unique (%) 0.1%

Sample

1st row Vidua
2nd row Turdus
3rd row Neophema
4th row Platycercus
5th row Polytelis
ValueCountFrequency (%)
turdus 5647
 
2.0%
calidris 3893
 
1.3%
falco 3593
 
1.2%
pycnonotus 3364
 
1.2%
passer 3110
 
1.1%
accipiter 2980
 
1.0%
sylvia 2742
 
0.9%
emberiza 2716
 
0.9%
larus 2634
 
0.9%
corvus 2475
 
0.9%
Other values (2182) 256340
88.5%

Most occurring characters

ValueCountFrequency (%)
a 243380
 
10.1%
s 190746
 
8.0%
i 188999
 
7.9%
r 183484
 
7.6%
o 182529
 
7.6%
u 165872
 
6.9%
l 135576
 
5.7%
e 130021
 
5.4%
c 113946
 
4.7%
n 107840
 
4.5%
Other values (42) 756875
31.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2109774
87.9%
Uppercase Letter 289494
 
12.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 243380
11.5%
s 190746
9.0%
i 188999
9.0%
r 183484
 
8.7%
o 182529
 
8.7%
u 165872
 
7.9%
l 135576
 
6.4%
e 130021
 
6.2%
c 113946
 
5.4%
n 107840
 
5.1%
Other values (16) 467381
22.2%
Uppercase Letter
ValueCountFrequency (%)
P 46208
16.0%
C 45802
15.8%
A 32702
11.3%
T 22547
7.8%
S 22237
7.7%
M 17848
 
6.2%
L 17759
 
6.1%
G 11587
 
4.0%
E 10390
 
3.6%
F 9152
 
3.2%
Other values (16) 53262
18.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 2399268
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 243380
 
10.1%
s 190746
 
8.0%
i 188999
 
7.9%
r 183484
 
7.6%
o 182529
 
7.6%
u 165872
 
6.9%
l 135576
 
5.7%
e 130021
 
5.4%
c 113946
 
4.7%
n 107840
 
4.5%
Other values (42) 756875
31.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2399268
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 243380
 
10.1%
s 190746
 
8.0%
i 188999
 
7.9%
r 183484
 
7.6%
o 182529
 
7.6%
u 165872
 
6.9%
l 135576
 
5.7%
e 130021
 
5.4%
c 113946
 
4.7%
n 107840
 
4.5%
Other values (42) 756875
31.5%
Distinct 2287
Distinct (%) 0.8%
Missing 1636
Missing (%) 0.6%
Memory size 2.2 MiB

Length

Max length 18
Median length 15
Mean length 8.139814424
Min length 1

Characters and Unicode

Total characters 2354539
Distinct characters 54
Distinct categories 3
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 199
Unique (%) 0.1%

Sample

1st row Vidua
2nd row Turdus
3rd row Neophema
4th row Platycercus
5th row Polytelis
ValueCountFrequency (%)
turdus 5646
 
2.0%
larus 4361
 
1.5%
falco 3593
 
1.2%
parus 3587
 
1.2%
corvus 3377
 
1.2%
pycnonotus 3358
 
1.2%
sterna 3238
 
1.1%
passer 3110
 
1.1%
anas 2998
 
1.0%
accipiter 2980
 
1.0%
Other values (2277) 253014
87.5%

Most occurring characters

ValueCountFrequency (%)
a 248614
 
10.6%
r 185008
 
7.9%
s 182957
 
7.8%
i 178481
 
7.6%
o 170638
 
7.2%
u 166953
 
7.1%
e 131390
 
5.6%
l 131371
 
5.6%
c 112832
 
4.8%
t 105857
 
4.5%
Other values (44) 740438
31.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2065277
87.7%
Uppercase Letter 289259
 
12.3%
Other Punctuation 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 248614
12.0%
r 185008
9.0%
s 182957
8.9%
i 178481
 
8.6%
o 170638
 
8.3%
u 166953
 
8.1%
e 131390
 
6.4%
l 131371
 
6.4%
c 112832
 
5.5%
t 105857
 
5.1%
Other values (17) 451176
21.8%
Uppercase Letter
ValueCountFrequency (%)
P 46469
16.1%
C 41880
14.5%
A 34789
12.0%
S 21995
7.6%
T 20919
 
7.2%
M 18710
 
6.5%
L 17139
 
5.9%
E 12134
 
4.2%
D 10202
 
3.5%
H 10049
 
3.5%
Other values (16) 54973
19.0%
Other Punctuation
ValueCountFrequency (%)
? 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2354536
> 99.9%
Common 3
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 248614
 
10.6%
r 185008
 
7.9%
s 182957
 
7.8%
i 178481
 
7.6%
o 170638
 
7.2%
u 166953
 
7.1%
e 131390
 
5.6%
l 131371
 
5.6%
c 112832
 
4.8%
t 105857
 
4.5%
Other values (43) 740435
31.4%
Common
ValueCountFrequency (%)
? 3
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2354527
> 99.9%
None 12
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 248614
 
10.6%
r 185008
 
7.9%
s 182957
 
7.8%
i 178481
 
7.6%
o 170638
 
7.2%
u 166953
 
7.1%
e 131390
 
5.6%
l 131371
 
5.6%
c 112832
 
4.8%
t 105857
 
4.5%
Other values (43) 740426
31.4%
None
ValueCountFrequency (%)
ë 12
100.0%

specificEpithet
Text

Missing 

Distinct 4206
Distinct (%) 1.5%
Missing 10799
Missing (%) 3.7%
Memory size 2.2 MiB

Length

Max length 17
Median length 14
Mean length 8.536481744
Min length 3

Characters and Unicode

Total characters 2391060
Distinct characters 26
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 404
Unique (%) 0.1%

Sample

1st row orientalis
2nd row viscivorus
3rd row splendida
4th row elegans
5th row anthopeplus
ValueCountFrequency (%)
alba 2049
 
0.7%
major 1951
 
0.7%
domesticus 1907
 
0.7%
cinerea 1808
 
0.6%
vulgaris 1734
 
0.6%
montanus 1697
 
0.6%
chloris 1575
 
0.6%
striata 1540
 
0.5%
chinensis 1514
 
0.5%
cristatus 1484
 
0.5%
Other values (4196) 262840
93.8%

Most occurring characters

ValueCountFrequency (%)
a 296714
12.4%
i 234928
9.8%
s 232760
9.7%
u 185091
 
7.7%
r 177046
 
7.4%
e 163697
 
6.8%
l 154142
 
6.4%
n 147702
 
6.2%
c 139563
 
5.8%
o 136297
 
5.7%
Other values (16) 523120
21.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2391060
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 296714
12.4%
i 234928
9.8%
s 232760
9.7%
u 185091
 
7.7%
r 177046
 
7.4%
e 163697
 
6.8%
l 154142
 
6.4%
n 147702
 
6.2%
c 139563
 
5.8%
o 136297
 
5.7%
Other values (16) 523120
21.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 2391060
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 296714
12.4%
i 234928
9.8%
s 232760
9.7%
u 185091
 
7.7%
r 177046
 
7.4%
e 163697
 
6.8%
l 154142
 
6.4%
n 147702
 
6.2%
c 139563
 
5.8%
o 136297
 
5.7%
Other values (16) 523120
21.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2391060
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 296714
12.4%
i 234928
9.8%
s 232760
9.7%
u 185091
 
7.7%
r 177046
 
7.4%
e 163697
 
6.8%
l 154142
 
6.4%
n 147702
 
6.2%
c 139563
 
5.8%
o 136297
 
5.7%
Other values (16) 523120
21.9%

infraspecificEpithet
Text

Missing 

Distinct 5180
Distinct (%) 3.1%
Missing 125699
Missing (%) 43.2%
Memory size 2.2 MiB

Length

Max length 17
Median length 14
Mean length 8.541080757
Min length 3

Characters and Unicode

Total characters 1410978
Distinct characters 26
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 832
Unique (%) 0.5%

Sample

1st row viscivorus
2nd row melanopterus
3rd row monarchoides
4th row rubescens
5th row meridionalis
ValueCountFrequency (%)
domesticus 2283
 
1.4%
vulgaris 1490
 
0.9%
merula 1145
 
0.7%
cinerea 1131
 
0.7%
nisus 1017
 
0.6%
glandarius 981
 
0.6%
montanus 946
 
0.6%
coelebs 924
 
0.6%
major 924
 
0.6%
cristatus 879
 
0.5%
Other values (5170) 153479
92.9%

Most occurring characters

ValueCountFrequency (%)
a 165124
11.7%
i 150040
10.6%
s 140413
10.0%
r 106309
 
7.5%
u 103801
 
7.4%
e 102944
 
7.3%
n 92129
 
6.5%
l 86050
 
6.1%
o 79489
 
5.6%
c 77292
 
5.5%
Other values (16) 307387
21.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1410978
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 165124
11.7%
i 150040
10.6%
s 140413
10.0%
r 106309
 
7.5%
u 103801
 
7.4%
e 102944
 
7.3%
n 92129
 
6.5%
l 86050
 
6.1%
o 79489
 
5.6%
c 77292
 
5.5%
Other values (16) 307387
21.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 1410978
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 165124
11.7%
i 150040
10.6%
s 140413
10.0%
r 106309
 
7.5%
u 103801
 
7.4%
e 102944
 
7.3%
n 92129
 
6.5%
l 86050
 
6.1%
o 79489
 
5.6%
c 77292
 
5.5%
Other values (16) 307387
21.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1410978
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 165124
11.7%
i 150040
10.6%
s 140413
10.0%
r 106309
 
7.5%
u 103801
 
7.4%
e 102944
 
7.3%
n 92129
 
6.5%
l 86050
 
6.1%
o 79489
 
5.6%
c 77292
 
5.5%
Other values (16) 307387
21.8%
Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size2.2 MiB
2025-01-08T18:40:27.812046image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Length

Max length10
Median length10
Mean length8.635483915
Min length4

Characters and Unicode

Total characters: 2512045
Distinct characters: 19
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: SPECIES
2nd row: SUBSPECIES
3rd row: SPECIES
4th row: SUBSPECIES
5th row: SUBSPECIES

Value       Count   Frequency (%)
subspecies  165197  56.8%
species     115131  39.6%
genus       9163    3.1%
class       538     0.2%
kingdom     486     0.2%
family      329     0.1%
order       47      < 0.1%
unranked    3       < 0.1%
form        3       < 0.1%
phylum      1       < 0.1%

Most occurring characters

ValueCountFrequency (%)
S 736092
29.3%
E 569869
22.7%
I 281143
 
11.2%
C 280866
 
11.2%
P 280329
 
11.2%
U 174364
 
6.9%
B 165197
 
6.6%
N 9655
 
0.4%
G 9649
 
0.4%
A 870
 
< 0.1%
Other values (9) 4011
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2512045
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 736092
29.3%
E 569869
22.7%
I 281143
 
11.2%
C 280866
 
11.2%
P 280329
 
11.2%
U 174364
 
6.9%
B 165197
 
6.6%
N 9655
 
0.4%
G 9649
 
0.4%
A 870
 
< 0.1%
Other values (9) 4011
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 2512045
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 736092
29.3%
E 569869
22.7%
I 281143
 
11.2%
C 280866
 
11.2%
P 280329
 
11.2%
U 174364
 
6.9%
B 165197
 
6.6%
N 9655
 
0.4%
G 9649
 
0.4%
A 870
 
< 0.1%
Other values (9) 4011
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2512045
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 736092
29.3%
E 569869
22.7%
I 281143
 
11.2%
C 280866
 
11.2%
P 280329
 
11.2%
U 174364
 
6.9%
B 165197
 
6.6%
N 9655
 
0.4%
G 9649
 
0.4%
A 870
 
< 0.1%
Other values (9) 4011
 
0.2%

nomenclaturalCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 1163592
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ICZN
2nd row: ICZN
3rd row: ICZN
4th row: ICZN
5th row: ICZN

Value  Count   Frequency (%)
iczn   290898  100.0%

Most occurring characters

ValueCountFrequency (%)
I 290898
25.0%
C 290898
25.0%
Z 290898
25.0%
N 290898
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1163592
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 290898
25.0%
C 290898
25.0%
Z 290898
25.0%
N 290898
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1163592
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 290898
25.0%
C 290898
25.0%
Z 290898
25.0%
N 290898
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1163592
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 290898
25.0%
C 290898
25.0%
Z 290898
25.0%
N 290898
25.0%
Distinct: 3
Distinct (%): < 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 2.2 MiB

Length

Max length: 8
Median length: 8
Mean length: 7.832105522
Min length: 7

Characters and Unicode

Total characters: 2278336
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ACCEPTED
2nd row: ACCEPTED
3rd row: ACCEPTED
4th row: ACCEPTED
5th row: ACCEPTED

Value     Count   Frequency (%)
accepted  239957  82.5%
synonym   48840   16.8%
doubtful  2100    0.7%

Most occurring characters

ValueCountFrequency (%)
C 479914
21.1%
E 479914
21.1%
T 242057
10.6%
D 242057
10.6%
A 239957
10.5%
P 239957
10.5%
Y 97680
 
4.3%
N 97680
 
4.3%
O 50940
 
2.2%
S 48840
 
2.1%
Other values (5) 59340
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2278336
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 479914
21.1%
E 479914
21.1%
T 242057
10.6%
D 242057
10.6%
A 239957
10.5%
P 239957
10.5%
Y 97680
 
4.3%
N 97680
 
4.3%
O 50940
 
2.2%
S 48840
 
2.1%
Other values (5) 59340
 
2.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 2278336
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 479914
21.1%
E 479914
21.1%
T 242057
10.6%
D 242057
10.6%
A 239957
10.5%
P 239957
10.5%
Y 97680
 
4.3%
N 97680
 
4.3%
O 50940
 
2.2%
S 48840
 
2.1%
Other values (5) 59340
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2278336
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 479914
21.1%
E 479914
21.1%
T 242057
10.6%
D 242057
10.6%
A 239957
10.5%
P 239957
10.5%
Y 97680
 
4.3%
N 97680
 
4.3%
O 50940
 
2.2%
S 48840
 
2.1%
Other values (5) 59340
 
2.6%

datasetKey
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 36
Median length: 36
Mean length: 36
Min length: 36

Characters and Unicode

Total characters: 10472328
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
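
The character-property breakdown described above can be sketched with Python's standard `unicodedata` module, which exposes the general category of each code point. This is only an illustration of the idea, not the profiling tool's own implementation; the helper name `category_counts` and the `CATEGORY_NAMES` lookup are hypothetical, and the input is the constant datasetKey value from this report.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this example only.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Ll": "Lowercase Letter",
    "Pd": "Dash Punctuation",
}


def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


dataset_key = "889c91a3-614f-4355-8df8-b6d0260a118c"
counts = category_counts(dataset_key)
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Because every one of the 290898 rows holds the same value, multiplying these per-value counts by 290898 gives the category totals reported for this variable: 23 × 290898 = 6690654 decimal digits, 9 × 290898 = 2618082 lowercase letters, and 4 × 290898 = 1163592 dashes.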

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 889c91a3-614f-4355-8df8-b6d0260a118c
2nd row: 889c91a3-614f-4355-8df8-b6d0260a118c
3rd row: 889c91a3-614f-4355-8df8-b6d0260a118c
4th row: 889c91a3-614f-4355-8df8-b6d0260a118c
5th row: 889c91a3-614f-4355-8df8-b6d0260a118c

Value                                 Count   Frequency (%)
889c91a3-614f-4355-8df8-b6d0260a118c  290898  100.0%

Most occurring characters

ValueCountFrequency (%)
8 1454490
13.9%
1 1163592
11.1%
- 1163592
11.1%
6 872694
 
8.3%
9 581796
 
5.6%
c 581796
 
5.6%
a 581796
 
5.6%
3 581796
 
5.6%
4 581796
 
5.6%
f 581796
 
5.6%
Other values (5) 2327184
22.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6690654
63.9%
Lowercase Letter 2618082
 
25.0%
Dash Punctuation 1163592
 
11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 1454490
21.7%
1 1163592
17.4%
6 872694
13.0%
9 581796
 
8.7%
3 581796
 
8.7%
4 581796
 
8.7%
5 581796
 
8.7%
0 581796
 
8.7%
2 290898
 
4.3%
Lowercase Letter
ValueCountFrequency (%)
c 581796
22.2%
a 581796
22.2%
f 581796
22.2%
d 581796
22.2%
b 290898
11.1%
Dash Punctuation
ValueCountFrequency (%)
- 1163592
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7854246
75.0%
Latin 2618082
 
25.0%

Most frequent character per script

Common
ValueCountFrequency (%)
8 1454490
18.5%
1 1163592
14.8%
- 1163592
14.8%
6 872694
11.1%
9 581796
 
7.4%
3 581796
 
7.4%
4 581796
 
7.4%
5 581796
 
7.4%
0 581796
 
7.4%
2 290898
 
3.7%
Latin
ValueCountFrequency (%)
c 581796
22.2%
a 581796
22.2%
f 581796
22.2%
d 581796
22.2%
b 290898
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10472328
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 1454490
13.9%
1 1163592
11.1%
- 1163592
11.1%
6 872694
 
8.3%
9 581796
 
5.6%
c 581796
 
5.6%
a 581796
 
5.6%
3 581796
 
5.6%
4 581796
 
5.6%
f 581796
 
5.6%
Other values (5) 2327184
22.2%

publishingCountry
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 581796
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NL
2nd row: NL
3rd row: NL
4th row: NL
5th row: NL

Value  Count   Frequency (%)
nl     290898  100.0%

Most occurring characters

ValueCountFrequency (%)
N 290898
50.0%
L 290898
50.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 581796
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 290898
50.0%
L 290898
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 581796
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 290898
50.0%
L 290898
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 581796
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 290898
50.0%
L 290898
50.0%
Distinct: 24995
Distinct (%): 8.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 24
Median length: 24
Mean length: 23.99487105
Min length: 20

Characters and Unicode

Total characters: 6980060
Distinct characters: 15
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2133
Unique (%): 0.7%

Sample

1st row: 2025-01-03T11:41:38.952Z
2nd row: 2025-01-03T11:41:39.036Z
3rd row: 2025-01-03T11:41:41.369Z
4th row: 2025-01-03T11:41:41.370Z
5th row: 2025-01-03T11:41:41.379Z

Value                     Count   Frequency (%)
2025-01-03t11:42:05.126z  149     0.1%
2025-01-03t11:42:05.124z  149     0.1%
2025-01-03t11:42:05.005z  148     0.1%
2025-01-03t11:42:05.125z  146     0.1%
2025-01-03t11:42:05.127z  145     < 0.1%
2025-01-03t11:42:05.122z  145     < 0.1%
2025-01-03t11:42:05.010z  142     < 0.1%
2025-01-03t11:42:04.999z  141     < 0.1%
2025-01-03t11:42:05.042z  139     < 0.1%
2025-01-03t11:42:04.998z  138     < 0.1%
Other values (24985)      289456  99.5%
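
The values of this variable are ISO 8601 UTC timestamps stored as text. A minimum length of 20 against a maximum of 24 suggests that some values omit the fractional seconds; that reading is an inference from the length statistics, and the 20-character example below is hypothetical. A minimal sketch of a parser that accepts both forms, using only the standard library (the helper name `parse_utc` is an assumption):

```python
from datetime import datetime, timezone


def parse_utc(ts):
    """Parse an ISO 8601 UTC timestamp ending in 'Z', with or without fractional seconds."""
    for fmt in ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ"):
        try:
            return datetime.strptime(ts, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognised timestamp: {ts!r}")


# The first two sample rows of this variable.
first = parse_utc("2025-01-03T11:41:38.952Z")
second = parse_utc("2025-01-03T11:41:39.036Z")
print((second - first).total_seconds())

# Hypothetical 20-character value without fractional seconds.
short = parse_utc("2025-01-03T11:42:05Z")
print(short.isoformat())
```

Parsing to `datetime` rather than comparing strings makes it possible to sort the values and compute intervals regardless of which form a row uses.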

Most occurring characters

ValueCountFrequency (%)
1 1161663
16.6%
0 1129747
16.2%
2 861543
12.3%
- 581796
8.3%
: 581796
8.3%
5 506791
7.3%
4 439261
 
6.3%
3 387300
 
5.5%
T 290898
 
4.2%
Z 290898
 
4.2%
Other values (5) 748367
10.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4944147
70.8%
Other Punctuation 872321
 
12.5%
Dash Punctuation 581796
 
8.3%
Uppercase Letter 581796
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1161663
23.5%
0 1129747
22.9%
2 861543
17.4%
5 506791
10.3%
4 439261
 
8.9%
3 387300
 
7.8%
8 122218
 
2.5%
6 117907
 
2.4%
9 114950
 
2.3%
7 102767
 
2.1%
Other Punctuation
ValueCountFrequency (%)
: 581796
66.7%
. 290525
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 290898
50.0%
Z 290898
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 581796
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6398264
91.7%
Latin 581796
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1161663
18.2%
0 1129747
17.7%
2 861543
13.5%
- 581796
9.1%
: 581796
9.1%
5 506791
7.9%
4 439261
 
6.9%
3 387300
 
6.1%
. 290525
 
4.5%
8 122218
 
1.9%
Other values (3) 335624
 
5.2%
Latin
ValueCountFrequency (%)
T 290898
50.0%
Z 290898
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6980060
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1161663
16.6%
0 1129747
16.2%
2 861543
12.3%
- 581796
8.3%
: 581796
8.3%
5 506791
7.3%
4 439261
 
6.3%
3 387300
 
5.5%
T 290898
 
4.2%
Z 290898
 
4.2%
Other values (5) 748367
10.7%
Distinct: 227
Distinct (%): 13.7%
Missing: 289238
Missing (%): 99.4%
Memory size: 2.2 MiB

Length

Max length: 18
Median length: 17
Mean length: 16.78433735
Min length: 3

Characters and Unicode

Total characters: 27862
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 91
Unique (%): 5.5%

Sample

1st row: 1294.2466800739585
2nd row: 2872.769076848754
3rd row: 3250.592564219525
4th row: 3894.154755246927
5th row: 4465.683444064726

Value               Count  Frequency (%)
2704.885187212414   232    14.0%
1241.6133704433169  74     4.5%
2872.769076848754   69     4.2%
0.0                 60     3.6%
1292.3392160898957  49     3.0%
4419.575196162919   48     2.9%
1167.2226527660587  48     2.9%
2874.034733991636   46     2.8%
2907.040794219252   39     2.3%
4191.557332376314   37     2.2%
Other values (217)  958    57.7%

Most occurring characters

ValueCountFrequency (%)
1 3286
11.8%
4 3194
11.5%
2 3081
11.1%
8 2849
10.2%
7 2844
10.2%
3 2330
8.4%
5 2277
8.2%
6 2169
7.8%
9 2146
7.7%
0 2026
7.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 26202
94.0%
Other Punctuation 1660
 
6.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 3286
12.5%
4 3194
12.2%
2 3081
11.8%
8 2849
10.9%
7 2844
10.9%
3 2330
8.9%
5 2277
8.7%
6 2169
8.3%
9 2146
8.2%
0 2026
7.7%
Other Punctuation
ValueCountFrequency (%)
. 1660
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 27862
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 3286
11.8%
4 3194
11.5%
2 3081
11.1%
8 2849
10.2%
7 2844
10.2%
3 2330
8.4%
5 2277
8.2%
6 2169
7.8%
9 2146
7.7%
0 2026
7.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27862
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 3286
11.8%
4 3194
11.5%
2 3081
11.1%
8 2849
10.2%
7 2844
10.2%
3 2330
8.4%
5 2277
8.2%
6 2169
7.8%
9 2146
7.7%
0 2026
7.3%

issue
Text

Distinct: 78
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 187
Median length: 182
Mean length: 102.3443991
Min length: 31

Characters and Unicode

Total characters: 29771781
Distinct characters: 25
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 15
Unique (%): < 0.1%

Sample

1st row: INSTITUTION_COLLECTION_MISMATCH
2nd row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;CONTINENT_DERIVED_FROM_COORDINATES;INSTITUTION_COLLECTION_MISMATCH
3rd row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;CONTINENT_DERIVED_FROM_COUNTRY;INSTITUTION_COLLECTION_MISMATCH
4th row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;CONTINENT_DERIVED_FROM_COORDINATES;INSTITUTION_COLLECTION_MISMATCH
5th row: OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;CONTINENT_DERIVED_FROM_COUNTRY;INSTITUTION_COLLECTION_MISMATCH

Value  Count  Frequency (%)
occurrence_status_inferred_from_individual_count;continent_derived_from_coordinates;institution_collection_mismatch  109406  37.6%
occurrence_status_inferred_from_individual_count;institution_collection_mismatch  60883  20.9%
occurrence_status_inferred_from_individual_count;continent_derived_from_country;institution_collection_mismatch  37878  13.0%
occurrence_status_inferred_from_individual_count;continent_derived_from_coordinates;taxon_match_higherrank;institution_collection_mismatch  14042  4.8%
occurrence_status_inferred_from_individual_count;taxon_match_higherrank;institution_collection_mismatch  12982  4.5%
institution_collection_mismatch  10900  3.7%
continent_derived_from_coordinates;institution_collection_mismatch  8265  2.8%
continent_derived_from_country;institution_collection_mismatch  6709  2.3%
occurrence_status_inferred_from_individual_count;continent_derived_from_country;taxon_match_higherrank;institution_collection_mismatch  5077  1.7%
occurrence_status_inferred_from_individual_count;continent_derived_from_coordinates;taxon_match_fuzzy;institution_collection_mismatch  3620  1.2%
Other values (68)  21136  7.3%
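
Each issue value is a semicolon-separated combination of flags, so the frequencies above count combinations rather than individual flags. A minimal sketch of how per-flag totals could be derived from combination counts follows. The helper name `flag_counts` is hypothetical, only three of the 78 combinations are included, and the flag strings are written in upper case as in the sample rows, on the assumption that the frequency listing lower-cases them.

```python
from collections import Counter


def flag_counts(value_counts, sep=";"):
    """Sum record counts per individual flag from counts of combined flag strings."""
    totals = Counter()
    for combined, n in value_counts.items():
        for flag in combined.split(sep):
            totals[flag] += n
    return totals


# Three of the combinations reported for this variable, with their record counts.
top_values = {
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;"
    "CONTINENT_DERIVED_FROM_COORDINATES;"
    "INSTITUTION_COLLECTION_MISMATCH": 109406,
    "OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT;"
    "INSTITUTION_COLLECTION_MISMATCH": 60883,
    "INSTITUTION_COLLECTION_MISMATCH": 10900,
}

totals = flag_counts(top_values)
for flag, n in totals.most_common():
    print(flag, n)
```

Run over all 78 combinations instead of this subset, the same function would give the number of records affected by each flag.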

Most occurring characters

ValueCountFrequency (%)
I 3094029
10.4%
T 2942498
9.9%
N 2807553
9.4%
_ 2589229
 
8.7%
O 2467649
 
8.3%
C 2378719
 
8.0%
E 2118380
 
7.1%
R 1991055
 
6.7%
U 1407852
 
4.7%
D 1340041
 
4.5%
Other values (15) 6634776
22.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 26666823
89.6%
Connector Punctuation 2589229
 
8.7%
Other Punctuation 515729
 
1.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 3094029
11.6%
T 2942498
11.0%
N 2807553
10.5%
O 2467649
9.3%
C 2378719
8.9%
E 2118380
7.9%
R 1991055
7.5%
U 1407852
 
5.3%
D 1340041
 
5.0%
S 1254931
 
4.7%
Other values (13) 4864116
18.2%
Connector Punctuation
ValueCountFrequency (%)
_ 2589229
100.0%
Other Punctuation
ValueCountFrequency (%)
; 515729
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 26666823
89.6%
Common 3104958
 
10.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 3094029
11.6%
T 2942498
11.0%
N 2807553
10.5%
O 2467649
9.3%
C 2378719
8.9%
E 2118380
7.9%
R 1991055
7.5%
U 1407852
 
5.3%
D 1340041
 
5.0%
S 1254931
 
4.7%
Other values (13) 4864116
18.2%
Common
ValueCountFrequency (%)
_ 2589229
83.4%
; 515729
 
16.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 29771781
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 3094029
10.4%
T 2942498
9.9%
N 2807553
9.4%
_ 2589229
 
8.7%
O 2467649
 
8.3%
C 2378719
 
8.0%
E 2118380
 
7.1%
R 1991055
 
6.7%
U 1407852
 
4.7%
D 1340041
 
4.5%
Other values (15) 6634776
22.3%

mediaType
Text

Missing 

Distinct: 83
Distinct (%): 0.1%
Missing: 207500
Missing (%): 71.3%
Memory size: 2.2 MiB

Length

Max length: 231
Median length: 21
Mean length: 21.95310439
Min length: 10

Characters and Unicode

Total characters: 1830845
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 43
Unique (%): 0.1%

Sample

1st row: StillImage
2nd row: StillImage;StillImage;StillImage;StillImage;StillImage;StillImage
3rd row: StillImage
4th row: StillImage
5th row: StillImage;StillImage;StillImage;StillImage

Value  Count  Frequency (%)
stillimage;stillimage  74683  89.6%
stillimage;stillimage;stillimage  3977  4.8%
stillimage  3155  3.8%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  396  0.5%
stillimage;stillimage;stillimage;stillimage  332  0.4%
stillimage;stillimage;stillimage;stillimage;stillimage  315  0.4%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  123  0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  59  0.1%
stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage;stillimage  30  < 0.1%
movingimage;stillimage;stillimage;stillimage  26  < 0.1%
Other values (73)  302  0.4%
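
Each mediaType value lists one media type per attached item, separated by semicolons, so the number of items on a record is the number of separators plus one. The sketch below counts the types in one value and then cross-checks the report's own totals; the helper name `media_type_counts` is hypothetical, and the figures in the check are copied from this variable's statistics.

```python
from collections import Counter


def media_type_counts(value, sep=";"):
    """Count the media types listed in one semicolon-separated mediaType value."""
    return Counter(value.split(sep))


# Fifth sample row of this variable.
row = "StillImage;StillImage;StillImage;StillImage"
row_counts = media_type_counts(row)
print(row_counts)

# Cross-check using figures from the report: every non-missing record
# contributes one item more than it has separators.
non_missing = 290898 - 207500   # observations minus missing values
separators = 90598              # reported count of ';'
items = non_missing + separators
print(items)
```

The result, 173996 items, agrees with the character counts reported for this variable: 173709 occurrences of `S` (StillImage) plus 287 occurrences of `M` (MovingImage).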

Most occurring characters

ValueCountFrequency (%)
l 347418
19.0%
g 174283
9.5%
i 173996
9.5%
I 173996
9.5%
m 173996
9.5%
a 173996
9.5%
e 173996
9.5%
S 173709
9.5%
t 173709
9.5%
; 90598
 
4.9%
Other values (4) 1148
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1392255
76.0%
Uppercase Letter 347992
 
19.0%
Other Punctuation 90598
 
4.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 347418
25.0%
g 174283
12.5%
i 173996
12.5%
m 173996
12.5%
a 173996
12.5%
e 173996
12.5%
t 173709
12.5%
o 287
 
< 0.1%
v 287
 
< 0.1%
n 287
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
I 173996
50.0%
S 173709
49.9%
M 287
 
0.1%
Other Punctuation
ValueCountFrequency (%)
; 90598
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1740247
95.1%
Common 90598
 
4.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 347418
20.0%
g 174283
10.0%
i 173996
10.0%
I 173996
10.0%
m 173996
10.0%
a 173996
10.0%
e 173996
10.0%
S 173709
10.0%
t 173709
10.0%
M 287
 
< 0.1%
Other values (3) 861
 
< 0.1%
Common
ValueCountFrequency (%)
; 90598
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1830845
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 347418
19.0%
g 174283
9.5%
i 173996
9.5%
I 173996
9.5%
m 173996
9.5%
a 173996
9.5%
e 173996
9.5%
S 173709
9.5%
t 173709
9.5%
; 90598
 
4.9%
Other values (4) 1148
 
0.1%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 5
Median length: 4
Mean length: 4.478215732
Min length: 4

Characters and Unicode

Total characters: 1302704
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: false
2nd row: true
3rd row: false
4th row: true
5th row: false

Value  Count   Frequency (%)
true   151786  52.2%
false  139112  47.8%

Most occurring characters

ValueCountFrequency (%)
e 290898
22.3%
t 151786
11.7%
r 151786
11.7%
u 151786
11.7%
f 139112
10.7%
a 139112
10.7%
l 139112
10.7%
s 139112
10.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1302704
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 290898
22.3%
t 151786
11.7%
r 151786
11.7%
u 151786
11.7%
f 139112
10.7%
a 139112
10.7%
l 139112
10.7%
s 139112
10.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 1302704
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 290898
22.3%
t 151786
11.7%
r 151786
11.7%
u 151786
11.7%
f 139112
10.7%
a 139112
10.7%
l 139112
10.7%
s 139112
10.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1302704
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 290898
22.3%
t 151786
11.7%
r 151786
11.7%
u 151786
11.7%
f 139112
10.7%
a 139112
10.7%
l 139112
10.7%
s 139112
10.7%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 5
Median length: 5
Mean length: 4.983825946
Min length: 4

Characters and Unicode

Total characters: 1449785
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: false
2nd row: false
3rd row: false
4th row: false
5th row: false

Value  Count   Frequency (%)
false  286193  98.4%
true   4705    1.6%

Most occurring characters

ValueCountFrequency (%)
e 290898
20.1%
f 286193
19.7%
a 286193
19.7%
l 286193
19.7%
s 286193
19.7%
t 4705
 
0.3%
r 4705
 
0.3%
u 4705
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1449785
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 290898
20.1%
f 286193
19.7%
a 286193
19.7%
l 286193
19.7%
s 286193
19.7%
t 4705
 
0.3%
r 4705
 
0.3%
u 4705
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1449785
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 290898
20.1%
f 286193
19.7%
a 286193
19.7%
l 286193
19.7%
s 286193
19.7%
t 4705
 
0.3%
r 4705
 
0.3%
u 4705
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1449785
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 290898
20.1%
f 286193
19.7%
a 286193
19.7%
l 286193
19.7%
s 286193
19.7%
t 4705
 
0.3%
r 4705
 
0.3%
u 4705
 
0.3%
Distinct: 15605
Distinct (%): 5.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 8
Median length: 7
Mean length: 6.990388384
Min length: 1

Characters and Unicode

Total characters: 2033490
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3323
Unique (%): 1.1%

Sample

1st row: 2484620
2nd row: 7342142
3rd row: 2479504
4th row: 6170652
5th row: 6170887

Value                 Count   Frequency (%)
5231191               1635    0.6%
6172874               1489    0.5%
6171845               1145    0.4%
2480242               1135    0.4%
7191198               1017    0.3%
7341902               981     0.3%
9156140               924     0.3%
2481137               897     0.3%
7192432               862     0.3%
8990910               856     0.3%
Other values (15595)  279957  96.2%

Most occurring characters

ValueCountFrequency (%)
2 265852
13.1%
4 249808
12.3%
1 240119
11.8%
7 230542
11.3%
9 222471
10.9%
8 182841
9.0%
6 181328
8.9%
0 167091
8.2%
5 149144
7.3%
3 144294
7.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2033490
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 265852
13.1%
4 249808
12.3%
1 240119
11.8%
7 230542
11.3%
9 222471
10.9%
8 182841
9.0%
6 181328
8.9%
0 167091
8.2%
5 149144
7.3%
3 144294
7.1%

Most occurring scripts

ValueCountFrequency (%)
Common 2033490
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 265852
13.1%
4 249808
12.3%
1 240119
11.8%
7 230542
11.3%
9 222471
10.9%
8 182841
9.0%
6 181328
8.9%
0 167091
8.2%
5 149144
7.3%
3 144294
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2033490
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 265852
13.1%
4 249808
12.3%
1 240119
11.8%
7 230542
11.3%
9 222471
10.9%
8 182841
9.0%
6 181328
8.9%
0 167091
8.2%
5 149144
7.3%
3 144294
7.1%
Distinct: 14746
Distinct (%): 5.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 2.2 MiB

Length

Max length: 8
Median length: 7
Mean length: 6.988418581
Min length: 1

Characters and Unicode

Total characters: 2032910
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3066
Unique (%): 1.1%

Sample

1st row: 2484620
2nd row: 7342142
3rd row: 2479504
4th row: 6170652
5th row: 6170887

Value                 Count   Frequency (%)
5231191               1635    0.6%
6172874               1489    0.5%
6065824               1204    0.4%
6171845               1145    0.4%
2480242               1135    0.4%
7191198               1017    0.3%
7341902               981     0.3%
9156140               924     0.3%
7192432               869     0.3%
8990910               856     0.3%
Other values (14736)  279642  96.1%

Most occurring characters

ValueCountFrequency (%)
2 262848
12.9%
4 256924
12.6%
1 235914
11.6%
7 227945
11.2%
9 218745
10.8%
8 186874
9.2%
6 180095
8.9%
0 173457
8.5%
5 148365
7.3%
3 141743
7.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2032910
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 262848
12.9%
4 256924
12.6%
1 235914
11.6%
7 227945
11.2%
9 218745
10.8%
8 186874
9.2%
6 180095
8.9%
0 173457
8.5%
5 148365
7.3%
3 141743
7.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2032910
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 262848
12.9%
4 256924
12.6%
1 235914
11.6%
7 227945
11.2%
9 218745
10.8%
8 186874
9.2%
6 180095
8.9%
0 173457
8.5%
5 148365
7.3%
3 141743
7.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2032910
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 262848
12.9%
4 256924
12.6%
1 235914
11.6%
7 227945
11.2%
9 218745
10.8%
8 186874
9.2%
6 180095
8.9%
0 173457
8.5%
5 148365
7.3%
3 141743
7.0%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 290898
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Value  Count   Frequency (%)
1      290897  > 99.9%
0      1       < 0.1%

Most occurring characters

ValueCountFrequency (%)
1 290897
> 99.9%
0 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 290898
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 290897
> 99.9%
0 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 290898
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 290897
> 99.9%
0 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 290898
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 290897
> 99.9%
0 1
 
< 0.1%
Distinct: 3
Distinct (%): < 0.1%
Missing: 766
Missing (%): 0.3%
Memory size: 2.2 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 580264
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 44
2nd row: 44
3rd row: 44
4th row: 44
5th row: 44

Value  Count   Frequency (%)
44     290063  > 99.9%
54     67      < 0.1%
52     2       < 0.1%

Most occurring characters

ValueCountFrequency (%)
4 580193
> 99.9%
5 69
 
< 0.1%
2 2
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 580264
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 580193
> 99.9%
5 69
 
< 0.1%
2 2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 580264
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 580193
> 99.9%
5 69
 
< 0.1%
2 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 580264
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 580193
> 99.9%
5 69
 
< 0.1%
2 2
 
< 0.1%
Distinct: 6
Distinct (%): < 0.1%
Missing: 770
Missing (%): 0.3%
Memory size: 2.2 MiB

Length

Max length: 8
Median length: 3
Mean length: 3.000017234
Min length: 3

Characters and Unicode

Total characters: 870389
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): < 0.1%
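
The report distinguishes Distinct from Unique. Judging by the figures for this variable, Distinct counts every different value, while Unique counts only the values that occur exactly once; that reading is an inference from the numbers, not a definition given in the report. A minimal sketch using the value counts reported for this variable (the helper name `distinct_and_unique` is hypothetical):

```python
from collections import Counter


def distinct_and_unique(value_counts):
    """Return (number of distinct values, number of values occurring exactly once)."""
    distinct = len(value_counts)
    unique = sum(1 for n in value_counts.values() if n == 1)
    return distinct, unique


# Value counts as reported for this variable.
counts = Counter({
    "212": 290053,
    "216": 66,
    "359": 6,
    "137": 1,
    "11592253": 1,
    "229": 1,
})

distinct, unique = distinct_and_unique(counts)
print(distinct, unique)
```

The counts also sum to 290128, which is the 290898 observations minus the 770 missing values reported for this variable.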

Sample

1st row: 212
2nd row: 212
3rd row: 212
4th row: 212
5th row: 212

Value     Count   Frequency (%)
212       290053  > 99.9%
216       66      < 0.1%
359       6       < 0.1%
137       1       < 0.1%
11592253  1       < 0.1%
229       1       < 0.1%

Most occurring characters

ValueCountFrequency (%)
2 580176
66.7%
1 290122
33.3%
6 66
 
< 0.1%
3 8
 
< 0.1%
5 8
 
< 0.1%
9 8
 
< 0.1%
7 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 870389
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 580176
66.7%
1 290122
33.3%
6 66
 
< 0.1%
3 8
 
< 0.1%
5 8
 
< 0.1%
9 8
 
< 0.1%
7 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 870389
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 580176
66.7%
1 290122
33.3%
6 66
 
< 0.1%
3 8
 
< 0.1%
5 8
 
< 0.1%
9 8
 
< 0.1%
7 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 870389
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 580176
66.7%
1 290122
33.3%
6 66
 
< 0.1%
3 8
 
< 0.1%
5 8
 
< 0.1%
9 8
 
< 0.1%
7 1
 
< 0.1%
Distinct: 53
Distinct (%): < 0.1%
Missing: 1492
Missing (%): 0.5%
Memory size: 2.2 MiB

Length

Max length: 8
Median length: 3
Mean length: 4.111473155
Min length: 3

Characters and Unicode

Total characters: 1189885
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): < 0.1%

Sample

1st row: 729
2nd row: 729
3rd row: 1445
4th row: 1445
5th row: 1445

Value              Count   Frequency (%)
729                145670  50.3%
7192402            33385   11.5%
7191147            10340   3.6%
1108               10163   3.5%
1446               9902    3.4%
724                8355    2.9%
723                7462    2.6%
1448               7409    2.6%
7190953            7141    2.5%
1447               7019    2.4%
Other values (43)  42560   14.7%

Most occurring characters

ValueCountFrequency (%)
7 256129
21.5%
2 240651
20.2%
9 230658
19.4%
1 163879
13.8%
4 141291
11.9%
0 63100
 
5.3%
5 32060
 
2.7%
8 26025
 
2.2%
3 22156
 
1.9%
6 13936
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1189885
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 256129
21.5%
2 240651
20.2%
9 230658
19.4%
1 163879
13.8%
4 141291
11.9%
0 63100
 
5.3%
5 32060
 
2.7%
8 26025
 
2.2%
3 22156
 
1.9%
6 13936
 
1.2%

Most occurring scripts

ValueCountFrequency (%)
Common 1189885
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
7 256129
21.5%
2 240651
20.2%
9 230658
19.4%
1 163879
13.8%
4 141291
11.9%
0 63100
 
5.3%
5 32060
 
2.7%
8 26025
 
2.2%
3 22156
 
1.9%
6 13936
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1189885
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7 256129
21.5%
2 240651
20.2%
9 230658
19.4%
1 163879
13.8%
4 141291
11.9%
0 63100
 
5.3%
5 32060
 
2.7%
8 26025
 
2.2%
3 22156
 
1.9%
6 13936
 
1.2%
Distinct: 250
Distinct (%): 0.1%
Missing: 1542
Missing (%): 0.5%
Memory size: 2.2 MiB

Length

Max length: 7
Median length: 4
Mean length: 4.279527641
Min length: 4

Characters and Unicode

Total characters: 1238307
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): < 0.1%

Sample

1st row: 5295
2nd row: 5290
3rd row: 9340
4th row: 9340
5th row: 9340

Value               Count   Frequency (%)
5282                11999   4.1%
9322                11304   3.9%
2986                10104   3.5%
2877                9920    3.4%
5233                9902    3.4%
9316                9646    3.3%
5242                8978    3.1%
5235                7653    2.6%
5290                6987    2.4%
9340                6737    2.3%
Other values (240)  196126  67.8%

Most occurring characters

ValueCountFrequency (%)
2 236924
19.1%
9 187320
15.1%
3 175452
14.2%
5 170061
13.7%
8 93668
 
7.6%
0 86294
 
7.0%
1 78151
 
6.3%
7 70985
 
5.7%
4 70809
 
5.7%
6 68643
 
5.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1238307
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 236924
19.1%
9 187320
15.1%
3 175452
14.2%
5 170061
13.7%
8 93668
 
7.6%
0 86294
 
7.0%
1 78151
 
6.3%
7 70985
 
5.7%
4 70809
 
5.7%
6 68643
 
5.5%

Most occurring scripts

ValueCountFrequency (%)
Common 1238307
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 236924
19.1%
9 187320
15.1%
3 175452
14.2%
5 170061
13.7%
8 93668
 
7.6%
0 86294
 
7.0%
1 78151
 
6.3%
7 70985
 
5.7%
4 70809
 
5.7%
6 68643
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1238307
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 236924
19.1%
9 187320
15.1%
3 175452
14.2%
5 170061
13.7%
8 93668
 
7.6%
0 86294
 
7.0%
1 78151
 
6.3%
7 70985
 
5.7%
4 70809
 
5.7%
6 68643
 
5.5%
Distinct: 2200
Distinct (%): 0.8%
Missing: 1404
Missing (%): 0.5%
Memory size: 2.2 MiB

Length

Max length: 8
Median length: 7
Mean length: 7.002994881
Min length: 7

Characters and Unicode

Total characters: 2027325
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 162
Unique (%): 0.1%

Sample

1st row: 2484612
2nd row: 2490714
3rd row: 2479503
4th row: 9623552
5th row: 2479496

Value                Count   Frequency (%)
2490714              5647    2.0%
2481739              3893    1.3%
2480996              3593    1.2%
2486114              3364    1.2%
2492321              3110    1.1%
9405810              2980    1.0%
2492941              2742    0.9%
2491468              2716    0.9%
2481126              2634    0.9%
2482468              2475    0.9%
Other values (2190)  256340  88.5%

Most occurring characters

ValueCountFrequency (%)
4 406143
20.0%
2 385130
19.0%
9 236025
11.6%
8 233741
11.5%
7 147304
 
7.3%
1 135409
 
6.7%
0 129989
 
6.4%
3 125202
 
6.2%
6 121813
 
6.0%
5 106569
 
5.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2027325
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 406143
20.0%
2 385130
19.0%
9 236025
11.6%
8 233741
11.5%
7 147304
 
7.3%
1 135409
 
6.7%
0 129989
 
6.4%
3 125202
 
6.2%
6 121813
 
6.0%
5 106569
 
5.3%

Most occurring scripts

ValueCountFrequency (%)
Common 2027325
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 406143
20.0%
2 385130
19.0%
9 236025
11.6%
8 233741
11.5%
7 147304
 
7.3%
1 135409
 
6.7%
0 129989
 
6.4%
3 125202
 
6.2%
6 121813
 
6.0%
5 106569
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2027325
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 406143
20.0%
2 385130
19.0%
9 236025
11.6%
8 233741
11.5%
7 147304
 
7.3%
1 135409
 
6.7%
0 129989
 
6.4%
3 125202
 
6.2%
6 121813
 
6.0%
5 106569
 
5.3%

speciesKey
Text

Missing 

Distinct: 7224
Distinct (%): 2.6%
Missing: 10568
Missing (%): 3.6%
Memory size: 2.2 MiB

Length

Max length: 8
Median length: 7
Mean length: 7.006963222
Min length: 7

Characters and Unicode

Total characters: 1964262
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 863
Unique (%): 0.3%

Sample

1st row: 2484620
2nd row: 2490774
3rd row: 2479504
4th row: 2479311
5th row: 2479498

Value                Count   Frequency (%)
5231190              1816    0.6%
9809229              1734    0.6%
5229493              1415    0.5%
2490719              1395    0.5%
2494422              1359    0.5%
7901064              1357    0.5%
9616058              1340    0.5%
9705453              1219    0.4%
6065824              1204    0.4%
2480637              1195    0.4%
Other values (7214)  266296  95.0%

Most occurring characters

ValueCountFrequency (%)
2 362903
18.5%
4 315405
16.1%
9 214561
10.9%
8 207833
10.6%
7 153593
7.8%
5 151542
7.7%
0 150344
7.7%
1 146530
7.5%
3 135043
 
6.9%
6 126508
 
6.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1964262
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 362903
18.5%
4 315405
16.1%
9 214561
10.9%
8 207833
10.6%
7 153593
7.8%
5 151542
7.7%
0 150344
7.7%
1 146530
7.5%
3 135043
 
6.9%
6 126508
 
6.4%

Most occurring scripts

ValueCountFrequency (%)
Common 1964262
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 362903
18.5%
4 315405
16.1%
9 214561
10.9%
8 207833
10.6%
7 153593
7.8%
5 151542
7.7%
0 150344
7.7%
1 146530
7.5%
3 135043
 
6.9%
6 126508
 
6.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1964262
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 362903
18.5%
4 315405
16.1%
9 214561
10.9%
8 207833
10.6%
7 153593
7.8%
5 151542
7.7%
0 150344
7.7%
1 146530
7.5%
3 135043
 
6.9%
6 126508
 
6.4%

species
Text

Missing 

Distinct: 7216
Distinct (%): 2.6%
Missing: 10568
Missing (%): 3.6%
Memory size: 2.2 MiB

Length

Max length: 34
Median length: 28
Mean length: 17.82155317
Min length: 9

Characters and Unicode

Total characters: 4995916
Distinct characters: 53
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 862
Unique (%): 0.3%

Sample

1st row: Vidua orientalis
2nd row: Turdus viscivorus
3rd row: Neophema splendida
4th row: Platycercus elegans
5th row: Polytelis anthopeplus

Value                Count   Frequency (%)
turdus               5630    1.0%
calidris             3874    0.7%
falco                3588    0.6%
passer               3103    0.6%
pycnonotus           3090    0.6%
accipiter            2976    0.5%
sylvia               2721    0.5%
emberiza             2715    0.5%
larus                2625    0.5%
buteo                2500    0.4%
Other values (6121)  528306  94.2%

Most occurring characters

ValueCountFrequency (%)
a 529984
 
10.6%
s 420166
 
8.4%
i 417749
 
8.4%
r 355972
 
7.1%
u 347702
 
7.0%
o 311426
 
6.2%
e 289384
 
5.8%
l 285661
 
5.7%
280798
 
5.6%
n 251540
 
5.0%
Other values (43) 1505534
30.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4434788
88.8%
Space Separator 280798
 
5.6%
Uppercase Letter 280330
 
5.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 529984
12.0%
s 420166
9.5%
i 417749
9.4%
r 355972
 
8.0%
u 347702
 
7.8%
o 311426
 
7.0%
e 289384
 
6.5%
l 285661
 
6.4%
n 251540
 
5.7%
c 250267
 
5.6%
Other values (16) 974937
22.0%
Uppercase Letter
ValueCountFrequency (%)
P 45208
16.1%
C 44095
15.7%
A 32371
11.5%
T 22022
7.9%
S 21712
7.7%
M 17089
 
6.1%
L 16790
 
6.0%
G 11340
 
4.0%
E 9976
 
3.6%
F 9059
 
3.2%
Other values (16) 50668
18.1%
Space Separator
ValueCountFrequency (%)
280798
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4715118
94.4%
Common 280798
 
5.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 529984
11.2%
s 420166
 
8.9%
i 417749
 
8.9%
r 355972
 
7.5%
u 347702
 
7.4%
o 311426
 
6.6%
e 289384
 
6.1%
l 285661
 
6.1%
n 251540
 
5.3%
c 250267
 
5.3%
Other values (42) 1255267
26.6%
Common
ValueCountFrequency (%)
280798
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4995916
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 529984
 
10.6%
s 420166
 
8.4%
i 417749
 
8.4%
r 355972
 
7.1%
u 347702
 
7.0%
o 311426
 
6.2%
e 289384
 
5.8%
l 285661
 
5.7%
280798
 
5.6%
n 251540
 
5.0%
Other values (43) 1505534
30.1%
Distinct: 14746
Distinct (%): 5.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 2.2 MiB

Length

Max length: 75
Median length: 64
Mean length: 33.57232629
Min length: 4

Characters and Unicode

Total characters: 9766089
Distinct characters: 82
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3066
Unique (%): 1.1%

Sample

1st row: Vidua orientalis Heuglin, 1870
2nd row: Turdus viscivorus viscivorus
3rd row: Neophema splendida (Gould, 1841)
4th row: Platycercus elegans melanopterus North, 1906
5th row: Polytelis anthopeplus monarchoides Schodde, 1993
ValueCountFrequency (%)
linnaeus 46619
 
4.1%
1758 36632
 
3.2%
temminck 9316
 
0.8%
1766 8978
 
0.8%
vieillot 8962
 
0.8%
8270
 
0.7%
1789 7244
 
0.6%
1821 6618
 
0.6%
horsfield 6307
 
0.6%
gmelin 5819
 
0.5%
Other values (9799) 988193
87.2%

Most occurring characters

ValueCountFrequency (%)
842061
 
8.6%
a 828778
 
8.5%
i 702778
 
7.2%
s 656289
 
6.7%
e 559707
 
5.7%
r 534908
 
5.5%
u 520424
 
5.3%
n 514076
 
5.3%
l 461054
 
4.7%
o 460789
 
4.7%
Other values (72) 3685225
37.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7139395
73.1%
Space Separator 842061
 
8.6%
Decimal Number 765010
 
7.8%
Uppercase Letter 537011
 
5.5%
Other Punctuation 243588
 
2.5%
Close Punctuation 119038
 
1.2%
Open Punctuation 119038
 
1.2%
Dash Punctuation 947
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 828778
11.6%
i 702778
9.8%
s 656289
9.2%
e 559707
 
7.8%
r 534908
 
7.5%
u 520424
 
7.3%
n 514076
 
7.2%
l 461054
 
6.5%
o 460789
 
6.5%
c 356956
 
5.0%
Other values (25) 1543636
21.6%
Uppercase Letter
ValueCountFrequency (%)
L 81449
15.2%
P 55576
10.3%
C 54000
10.1%
S 46590
8.7%
A 36902
 
6.9%
G 34502
 
6.4%
T 34045
 
6.3%
B 27485
 
5.1%
M 26544
 
4.9%
H 24693
 
4.6%
Other values (17) 115225
21.5%
Decimal Number
ValueCountFrequency (%)
1 230603
30.1%
8 167122
21.8%
7 102767
13.4%
5 56503
 
7.4%
6 44366
 
5.8%
9 43486
 
5.7%
2 37523
 
4.9%
3 33722
 
4.4%
0 26573
 
3.5%
4 22345
 
2.9%
Other Punctuation
ValueCountFrequency (%)
, 191266
78.5%
. 43873
 
18.0%
& 8267
 
3.4%
' 179
 
0.1%
? 3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
842061
100.0%
Close Punctuation
ValueCountFrequency (%)
) 119038
100.0%
Open Punctuation
ValueCountFrequency (%)
( 119038
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 947
100.0%
Math Symbol
ValueCountFrequency (%)
× 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7676406
78.6%
Common 2089683
 
21.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 828778
 
10.8%
i 702778
 
9.2%
s 656289
 
8.5%
e 559707
 
7.3%
r 534908
 
7.0%
u 520424
 
6.8%
n 514076
 
6.7%
l 461054
 
6.0%
o 460789
 
6.0%
c 356956
 
4.7%
Other values (52) 2080647
27.1%
Common
ValueCountFrequency (%)
842061
40.3%
1 230603
 
11.0%
, 191266
 
9.2%
8 167122
 
8.0%
) 119038
 
5.7%
( 119038
 
5.7%
7 102767
 
4.9%
5 56503
 
2.7%
6 44366
 
2.1%
. 43873
 
2.1%
Other values (10) 173046
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9762382
> 99.9%
None 3707
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
842061
 
8.6%
a 828778
 
8.5%
i 702778
 
7.2%
s 656289
 
6.7%
e 559707
 
5.7%
r 534908
 
5.5%
u 520424
 
5.3%
n 514076
 
5.3%
l 461054
 
4.7%
o 460789
 
4.7%
Other values (61) 3681518
37.7%
None
ValueCountFrequency (%)
ü 2457
66.3%
ø 471
 
12.7%
é 262
 
7.1%
ä 242
 
6.5%
á 177
 
4.8%
É 42
 
1.1%
è 40
 
1.1%
ë 5
 
0.1%
ñ 5
 
0.1%
ö 5
 
0.1%
Distinct: 27734
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 122
Median length: 73
Mean length: 38.15606845
Min length: 3

Characters and Unicode

Total characters: 11099524
Distinct characters: 99
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8751
Unique (%): 3.0%

Sample

1st row: Vidua orientalis cf Heuglin, 1871
2nd row: Turdus viscivorus viscivorus Linnaeus, 1758
3rd row: Neophema splendida Gould, 1841
4th row: Platycercus elegans melanopterus North, 1906
5th row: Polytelis anthopeplus monarchoides
ValueCountFrequency (%)
linnaeus 87601
 
6.6%
1758 63048
 
4.8%
temminck 13095
 
1.0%
vieillot 10951
 
0.8%
10616
 
0.8%
gmelin 9488
 
0.7%
horsfield 8374
 
0.6%
1766 7987
 
0.6%
1789 5944
 
0.5%
1821 5917
 
0.4%
Other values (11808) 1095813
83.1%

Most occurring characters

ValueCountFrequency (%)
1027983
 
9.3%
a 957166
 
8.6%
i 794136
 
7.2%
s 749765
 
6.8%
e 662963
 
6.0%
n 638053
 
5.7%
r 590663
 
5.3%
u 588433
 
5.3%
l 507153
 
4.6%
o 490037
 
4.4%
Other values (89) 4093172
36.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8068871
72.7%
Space Separator 1027983
 
9.3%
Decimal Number 872433
 
7.9%
Uppercase Letter 606433
 
5.5%
Other Punctuation 282499
 
2.5%
Open Punctuation 120086
 
1.1%
Close Punctuation 120028
 
1.1%
Dash Punctuation 908
 
< 0.1%
Math Symbol 282
 
< 0.1%
Modifier Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 957166
11.9%
i 794136
9.8%
s 749765
9.3%
e 662963
 
8.2%
n 638053
 
7.9%
r 590663
 
7.3%
u 588433
 
7.3%
l 507153
 
6.3%
o 490037
 
6.1%
t 388478
 
4.8%
Other values (30) 1702024
21.1%
Uppercase Letter
ValueCountFrequency (%)
L 123494
20.4%
P 57198
9.4%
C 50372
8.3%
S 50352
8.3%
T 36890
 
6.1%
A 36500
 
6.0%
G 35206
 
5.8%
B 31451
 
5.2%
M 30799
 
5.1%
H 30507
 
5.0%
Other values (16) 123664
20.4%
Other Punctuation
ValueCountFrequency (%)
, 229223
81.1%
. 42279
 
15.0%
& 9861
 
3.5%
' 558
 
0.2%
? 301
 
0.1%
" 142
 
0.1%
/ 69
 
< 0.1%
: 43
 
< 0.1%
\ 16
 
< 0.1%
! 4
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 255589
29.3%
8 193807
22.2%
7 127628
14.6%
5 83250
 
9.5%
9 44056
 
5.0%
6 43070
 
4.9%
2 39450
 
4.5%
3 33961
 
3.9%
4 26105
 
3.0%
0 25517
 
2.9%
Math Symbol
ValueCountFrequency (%)
< 140
49.6%
> 131
46.5%
= 9
 
3.2%
2
 
0.7%
Close Punctuation
ValueCountFrequency (%)
) 119957
99.9%
] 42
 
< 0.1%
} 29
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 120015
99.9%
[ 71
 
0.1%
Space Separator
ValueCountFrequency (%)
1027983
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 908
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8675294
78.2%
Common 2424220
 
21.8%
Greek 10
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 957166
11.0%
i 794136
 
9.2%
s 749765
 
8.6%
e 662963
 
7.6%
n 638053
 
7.4%
r 590663
 
6.8%
u 588433
 
6.8%
l 507153
 
5.8%
o 490037
 
5.6%
t 388478
 
4.5%
Other values (55) 2308447
26.6%
Common
ValueCountFrequency (%)
1027983
42.4%
1 255589
 
10.5%
, 229223
 
9.5%
8 193807
 
8.0%
7 127628
 
5.3%
( 120015
 
5.0%
) 119957
 
4.9%
5 83250
 
3.4%
9 44056
 
1.8%
6 43070
 
1.8%
Other values (23) 179642
 
7.4%
Greek
ValueCountFrequency (%)
δ 10
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11090283
99.9%
None 9239
 
0.1%
Math Operators 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1027983
 
9.3%
a 957166
 
8.6%
i 794136
 
7.2%
s 749765
 
6.8%
e 662963
 
6.0%
n 638053
 
5.8%
r 590663
 
5.3%
u 588433
 
5.3%
l 507153
 
4.6%
o 490037
 
4.4%
Other values (74) 4083931
36.8%
None
ValueCountFrequency (%)
ü 7442
80.5%
é 473
 
5.1%
ø 466
 
5.0%
ä 384
 
4.2%
á 246
 
2.7%
ö 58
 
0.6%
ï 55
 
0.6%
ë 52
 
0.6%
è 46
 
0.5%
δ 10
 
0.1%
Other values (4) 7
 
0.1%
Math Operators
ValueCountFrequency (%)
2
100.0%

protocol
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Characters and Unicode

Total characters: 3199878
Distinct characters: 10
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: DWC_ARCHIVE
2nd row: DWC_ARCHIVE
3rd row: DWC_ARCHIVE
4th row: DWC_ARCHIVE
5th row: DWC_ARCHIVE

Value        Count   Frequency (%)
dwc_archive  290898  100.0%

Most occurring characters

ValueCountFrequency (%)
C 581796
18.2%
D 290898
9.1%
W 290898
9.1%
_ 290898
9.1%
A 290898
9.1%
R 290898
9.1%
H 290898
9.1%
I 290898
9.1%
V 290898
9.1%
E 290898
9.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2908980
90.9%
Connector Punctuation 290898
 
9.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 581796
20.0%
D 290898
10.0%
W 290898
10.0%
A 290898
10.0%
R 290898
10.0%
H 290898
10.0%
I 290898
10.0%
V 290898
10.0%
E 290898
10.0%
Connector Punctuation
ValueCountFrequency (%)
_ 290898
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2908980
90.9%
Common 290898
 
9.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 581796
20.0%
D 290898
10.0%
W 290898
10.0%
A 290898
10.0%
R 290898
10.0%
H 290898
10.0%
I 290898
10.0%
V 290898
10.0%
E 290898
10.0%
Common
ValueCountFrequency (%)
_ 290898
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3199878
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 581796
18.2%
D 290898
9.1%
W 290898
9.1%
_ 290898
9.1%
A 290898
9.1%
R 290898
9.1%
H 290898
9.1%
I 290898
9.1%
V 290898
9.1%
E 290898
9.1%
Distinct: 24995
Distinct (%): 8.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 24
Median length: 24
Mean length: 23.99487105
Min length: 20

Characters and Unicode

Total characters: 6980060
Distinct characters: 15
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2133
Unique (%): 0.7%

Sample

1st row: 2025-01-03T11:41:38.952Z
2nd row: 2025-01-03T11:41:39.036Z
3rd row: 2025-01-03T11:41:41.369Z
4th row: 2025-01-03T11:41:41.370Z
5th row: 2025-01-03T11:41:41.379Z

Value                     Count   Frequency (%)
2025-01-03t11:42:05.126z  149     0.1%
2025-01-03t11:42:05.124z  149     0.1%
2025-01-03t11:42:05.005z  148     0.1%
2025-01-03t11:42:05.125z  146     0.1%
2025-01-03t11:42:05.127z  145     < 0.1%
2025-01-03t11:42:05.122z  145     < 0.1%
2025-01-03t11:42:05.010z  142     < 0.1%
2025-01-03t11:42:04.999z  141     < 0.1%
2025-01-03t11:42:05.042z  139     < 0.1%
2025-01-03t11:42:04.998z  138     < 0.1%
Other values (24985)      289456  99.5%

Most occurring characters

ValueCountFrequency (%)
1 1161663
16.6%
0 1129747
16.2%
2 861543
12.3%
- 581796
8.3%
: 581796
8.3%
5 506791
7.3%
4 439261
 
6.3%
3 387300
 
5.5%
T 290898
 
4.2%
Z 290898
 
4.2%
Other values (5) 748367
10.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4944147
70.8%
Other Punctuation 872321
 
12.5%
Dash Punctuation 581796
 
8.3%
Uppercase Letter 581796
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1161663
23.5%
0 1129747
22.9%
2 861543
17.4%
5 506791
10.3%
4 439261
 
8.9%
3 387300
 
7.8%
8 122218
 
2.5%
6 117907
 
2.4%
9 114950
 
2.3%
7 102767
 
2.1%
Other Punctuation
ValueCountFrequency (%)
: 581796
66.7%
. 290525
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 290898
50.0%
Z 290898
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 581796
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6398264
91.7%
Latin 581796
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1161663
18.2%
0 1129747
17.7%
2 861543
13.5%
- 581796
9.1%
: 581796
9.1%
5 506791
7.9%
4 439261
 
6.9%
3 387300
 
6.1%
. 290525
 
4.5%
8 122218
 
1.9%
Other values (3) 335624
 
5.2%
Latin
ValueCountFrequency (%)
T 290898
50.0%
Z 290898
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6980060
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1161663
16.6%
0 1129747
16.2%
2 861543
12.3%
- 581796
8.3%
: 581796
8.3%
5 506791
7.3%
4 439261
 
6.3%
3 387300
 
5.5%
T 290898
 
4.2%
Z 290898
 
4.2%
Other values (5) 748367
10.7%

lastCrawled
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 24
Median length: 24
Mean length: 24
Min length: 24

Characters and Unicode

Total characters: 6981552
Distinct characters: 12
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2025-01-03T11:34:30.428Z
2nd row: 2025-01-03T11:34:30.428Z
3rd row: 2025-01-03T11:34:30.428Z
4th row: 2025-01-03T11:34:30.428Z
5th row: 2025-01-03T11:34:30.428Z

Value                     Count   Frequency (%)
2025-01-03t11:34:30.428z  290898  100.0%

Most occurring characters

ValueCountFrequency (%)
0 1163592
16.7%
2 872694
12.5%
1 872694
12.5%
3 872694
12.5%
- 581796
8.3%
: 581796
8.3%
4 581796
8.3%
5 290898
 
4.2%
T 290898
 
4.2%
. 290898
 
4.2%
Other values (2) 581796
8.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4945266
70.8%
Other Punctuation 872694
 
12.5%
Dash Punctuation 581796
 
8.3%
Uppercase Letter 581796
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1163592
23.5%
2 872694
17.6%
1 872694
17.6%
3 872694
17.6%
4 581796
11.8%
5 290898
 
5.9%
8 290898
 
5.9%
Other Punctuation
ValueCountFrequency (%)
: 581796
66.7%
. 290898
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 290898
50.0%
Z 290898
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 581796
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6399756
91.7%
Latin 581796
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1163592
18.2%
2 872694
13.6%
1 872694
13.6%
3 872694
13.6%
- 581796
9.1%
: 581796
9.1%
4 581796
9.1%
5 290898
 
4.5%
. 290898
 
4.5%
8 290898
 
4.5%
Latin
ValueCountFrequency (%)
T 290898
50.0%
Z 290898
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6981552
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1163592
16.7%
2 872694
12.5%
1 872694
12.5%
3 872694
12.5%
- 581796
8.3%
: 581796
8.3%
4 581796
8.3%
5 290898
 
4.2%
T 290898
 
4.2%
. 290898
 
4.2%
Other values (2) 581796
8.3%

repatriated
Text

Missing 

Distinct: 2
Distinct (%): < 0.1%
Missing: 46939
Missing (%): 16.1%
Memory size: 2.2 MiB

Length

Max length: 5
Median length: 4
Mean length: 4.284781459
Min length: 4

Characters and Unicode

Total characters: 1045311
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: false
2nd row: true
3rd row: true
4th row: true
5th row: true

Value  Count   Frequency (%)
true   174484  71.5%
false  69475   28.5%

Most occurring characters

ValueCountFrequency (%)
e 243959
23.3%
t 174484
16.7%
r 174484
16.7%
u 174484
16.7%
f 69475
 
6.6%
a 69475
 
6.6%
l 69475
 
6.6%
s 69475
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1045311
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 243959
23.3%
t 174484
16.7%
r 174484
16.7%
u 174484
16.7%
f 69475
 
6.6%
a 69475
 
6.6%
l 69475
 
6.6%
s 69475
 
6.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 1045311
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 243959
23.3%
t 174484
16.7%
r 174484
16.7%
u 174484
16.7%
f 69475
 
6.6%
a 69475
 
6.6%
l 69475
 
6.6%
s 69475
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1045311
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 243959
23.3%
t 174484
16.7%
r 174484
16.7%
u 174484
16.7%
f 69475
 
6.6%
a 69475
 
6.6%
l 69475
 
6.6%
s 69475
 
6.6%

isSequenced
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 1454490
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: false
2nd row: false
3rd row: false
4th row: false
5th row: false

Value  Count   Frequency (%)
false  290898  100.0%

Most occurring characters

ValueCountFrequency (%)
f 290898
20.0%
a 290898
20.0%
l 290898
20.0%
s 290898
20.0%
e 290898
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1454490
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
f 290898
20.0%
a 290898
20.0%
l 290898
20.0%
s 290898
20.0%
e 290898
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1454490
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
f 290898
20.0%
a 290898
20.0%
l 290898
20.0%
s 290898
20.0%
e 290898
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1454490
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
f 290898
20.0%
a 290898
20.0%
l 290898
20.0%
s 290898
20.0%
e 290898
20.0%

gbifRegion
Text

Missing 

Distinct: 7
Distinct (%): < 0.1%
Missing: 50475
Missing (%): 17.4%
Memory size: 2.2 MiB

Length

Max length: 13
Median length: 10
Mean length: 6.326869725
Min length: 4

Characters and Unicode

Total characters: 1521125
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: EUROPE
2nd row: OCEANIA
3rd row: OCEANIA
4th row: OCEANIA
5th row: AFRICA

Value          Count  Frequency (%)
asia           91619  38.1%
europe         88150  36.7%
latin_america  31593  13.1%
africa         19035  7.9%
north_america  4960   2.1%
oceania        4770   2.0%
antarctica     296    0.1%

Most occurring characters

ValueCountFrequency (%)
A 336435
22.1%
E 217623
14.3%
I 183866
12.1%
R 148994
9.8%
O 97880
 
6.4%
S 91619
 
6.0%
U 88150
 
5.8%
P 88150
 
5.8%
C 60950
 
4.0%
N 41619
 
2.7%
Other values (6) 165839
10.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1484572
97.6%
Connector Punctuation 36553
 
2.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 336435
22.7%
E 217623
14.7%
I 183866
12.4%
R 148994
10.0%
O 97880
 
6.6%
S 91619
 
6.2%
U 88150
 
5.9%
P 88150
 
5.9%
C 60950
 
4.1%
N 41619
 
2.8%
Other values (5) 129286
 
8.7%
Connector Punctuation
ValueCountFrequency (%)
_ 36553
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1484572
97.6%
Common 36553
 
2.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 336435
22.7%
E 217623
14.7%
I 183866
12.4%
R 148994
10.0%
O 97880
 
6.6%
S 91619
 
6.2%
U 88150
 
5.9%
P 88150
 
5.9%
C 60950
 
4.1%
N 41619
 
2.8%
Other values (5) 129286
 
8.7%
Common
ValueCountFrequency (%)
_ 36553
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1521125
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 336435
22.1%
E 217623
14.3%
I 183866
12.1%
R 148994
9.8%
O 97880
 
6.4%
S 91619
 
6.0%
U 88150
 
5.8%
P 88150
 
5.8%
C 60950
 
4.0%
N 41619
 
2.7%
Other values (6) 165839
10.9%

publishedByGbifRegion
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.2 MiB

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters 1745388
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
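
As a rough illustration of the character properties mentioned above, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the code the profiling tool runs; the helper name `category_counts` and the small `CATEGORY_NAMES` lookup are assumptions made for this example, and the two input strings are sample values that appear in this report.

```python
import unicodedata
from collections import Counter

# Hypothetical lookup from two-letter general-category codes to the
# long names used in the report's "Most occurring categories" tables.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Pc": "Connector Punctuation",
    "Po": "Other Punctuation",
}

def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# "EUROPE" is six uppercase letters; "NLD.14_1" mixes four categories.
europe = category_counts(["EUROPE"])
gid = category_counts(["NLD.14_1"])

for code, n in sorted(gid.items()):
    print(CATEGORY_NAMES.get(code, code), n)
```

For a constant column such as this one, every character falls in a single category, which is consistent with the "Distinct categories 1" figure reported for the variable.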

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row EUROPE
2nd row EUROPE
3rd row EUROPE
4th row EUROPE
5th row EUROPE
Value     Count    Frequency (%)
europe    290898   100.0%

Most occurring characters

ValueCountFrequency (%)
E 581796
33.3%
U 290898
16.7%
R 290898
16.7%
O 290898
16.7%
P 290898
16.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1745388
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 581796
33.3%
U 290898
16.7%
R 290898
16.7%
O 290898
16.7%
P 290898
16.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 1745388
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 581796
33.3%
U 290898
16.7%
R 290898
16.7%
O 290898
16.7%
P 290898
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1745388
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 581796
33.3%
U 290898
16.7%
R 290898
16.7%
O 290898
16.7%
P 290898
16.7%

level0Gid
Text

Missing 

Distinct 217
Distinct (%) 0.2%
Missing 158562
Missing (%) 54.5%
Memory size 2.2 MiB

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 397008
Distinct characters 32
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
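
The summary figures for this variable can be cross-checked with a little arithmetic. The sketch below assumes that "Missing (%)" is the missing count divided by the number of observations (290898, from the report overview) and that "Total characters" is summed over non-missing values only; both are assumptions about how the tool derives these fields, although they reproduce the reported numbers.

```python
observations = 290898  # "Number of observations" from the report overview
missing = 158562       # level0Gid "Missing"
length = 3             # level0Gid min, mean and max length are all 3

# Share of missing values, rounded to one decimal as in the report.
missing_pct = round(100 * missing / observations, 1)

# Every non-missing value has the same length, so the total is a product.
non_missing = observations - missing
total_chars = non_missing * length

print(missing_pct, non_missing, total_chars)
```

The results, 54.5 and 397008, agree with the "Missing (%)" and "Total characters" values listed for level0Gid.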

Unique

Unique 16
Unique (%) < 0.1%

Sample

1st row NLD
2nd row AUS
3rd row GMB
4th row NZL
5th row MDG
Value                 Count   Frequency (%)
nld                   46099   34.8%
idn                   43479   32.9%
sur                   3814    2.9%
usa                   1760    1.3%
gbr                   1501    1.1%
rus                   1424    1.1%
deu                   1330    1.0%
chn                   1268    1.0%
bra                   1179    0.9%
twn                   1083    0.8%
Other values (207)    29399   22.2%

Most occurring characters

ValueCountFrequency (%)
N 97554
24.6%
D 94303
23.8%
L 52176
13.1%
I 46877
11.8%
R 12813
 
3.2%
U 12497
 
3.1%
A 12065
 
3.0%
S 12061
 
3.0%
E 7425
 
1.9%
G 5996
 
1.5%
Other values (22) 43241
10.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 396526
99.9%
Decimal Number 482
 
0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 97554
24.6%
D 94303
23.8%
L 52176
13.2%
I 46877
11.8%
R 12813
 
3.2%
U 12497
 
3.2%
A 12065
 
3.0%
S 12061
 
3.0%
E 7425
 
1.9%
G 5996
 
1.5%
Other values (16) 42759
10.8%
Decimal Number
ValueCountFrequency (%)
0 241
50.0%
1 180
37.3%
3 30
 
6.2%
6 29
 
6.0%
2 1
 
0.2%
7 1
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 396526
99.9%
Common 482
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 97554
24.6%
D 94303
23.8%
L 52176
13.2%
I 46877
11.8%
R 12813
 
3.2%
U 12497
 
3.2%
A 12065
 
3.0%
S 12061
 
3.0%
E 7425
 
1.9%
G 5996
 
1.5%
Other values (16) 42759
10.8%
Common
ValueCountFrequency (%)
0 241
50.0%
1 180
37.3%
3 30
 
6.2%
6 29
 
6.0%
2 1
 
0.2%
7 1
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 397008
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 97554
24.6%
D 94303
23.8%
L 52176
13.1%
I 46877
11.8%
R 12813
 
3.2%
U 12497
 
3.1%
A 12065
 
3.0%
S 12061
 
3.0%
E 7425
 
1.9%
G 5996
 
1.5%
Other values (22) 43241
10.9%

level0Name
Text

Missing 

Distinct 217
Distinct (%) 0.2%
Missing 158562
Missing (%) 54.5%
Memory size 2.2 MiB

Length

Max length 32
Median length 30
Mean length 9.584957986
Min length 4

Characters and Unicode

Total characters 1268435
Distinct characters 63
Distinct categories 7
Distinct scripts 2
Distinct blocks 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 16
Unique (%) < 0.1%

Sample

1st row Netherlands
2nd row Australia
3rd row Gambia
4th row New Zealand
5th row Madagascar
Value                 Count   Frequency (%)
netherlands           46099   31.9%
indonesia             43479   30.1%
suriname              3814    2.6%
united                3266    2.3%
states                1764    1.2%
kingdom               1501    1.0%
russia                1424    1.0%
germany               1330    0.9%
china                 1268    0.9%
and                   1222    0.8%
Other values (258)    39462   27.3%

Most occurring characters

ValueCountFrequency (%)
n 166311
13.1%
e 163207
12.9%
a 144224
11.4%
d 103145
 
8.1%
s 101363
 
8.0%
i 80825
 
6.4%
r 65408
 
5.2%
t 61310
 
4.8%
l 57995
 
4.6%
o 54929
 
4.3%
Other values (53) 269718
21.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1112019
87.7%
Uppercase Letter 143274
 
11.3%
Space Separator 12293
 
1.0%
Other Punctuation 442
 
< 0.1%
Open Punctuation 156
 
< 0.1%
Close Punctuation 156
 
< 0.1%
Dash Punctuation 95
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 166311
15.0%
e 163207
14.7%
a 144224
13.0%
d 103145
9.3%
s 101363
9.1%
i 80825
7.3%
r 65408
 
5.9%
t 61310
 
5.5%
l 57995
 
5.2%
o 54929
 
4.9%
Other values (21) 113302
10.2%
Uppercase Letter
ValueCountFrequency (%)
N 47390
33.1%
I 46184
32.2%
S 11097
 
7.7%
C 5174
 
3.6%
A 4338
 
3.0%
T 3736
 
2.6%
G 3519
 
2.5%
U 3457
 
2.4%
B 2858
 
2.0%
K 2492
 
1.7%
Other values (15) 13029
 
9.1%
Other Punctuation
ValueCountFrequency (%)
, 432
97.7%
. 8
 
1.8%
' 2
 
0.5%
Space Separator
ValueCountFrequency (%)
12293
100.0%
Open Punctuation
ValueCountFrequency (%)
( 156
100.0%
Close Punctuation
ValueCountFrequency (%)
) 156
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 95
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1255293
99.0%
Common 13142
 
1.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 166311
13.2%
e 163207
13.0%
a 144224
11.5%
d 103145
8.2%
s 101363
 
8.1%
i 80825
 
6.4%
r 65408
 
5.2%
t 61310
 
4.9%
l 57995
 
4.6%
o 54929
 
4.4%
Other values (46) 256576
20.4%
Common
ValueCountFrequency (%)
12293
93.5%
, 432
 
3.3%
( 156
 
1.2%
) 156
 
1.2%
- 95
 
0.7%
. 8
 
0.1%
' 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1267470
99.9%
None 965
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 166311
13.1%
e 163207
12.9%
a 144224
11.4%
d 103145
 
8.1%
s 101363
 
8.0%
i 80825
 
6.4%
r 65408
 
5.2%
t 61310
 
4.8%
l 57995
 
4.6%
o 54929
 
4.3%
Other values (47) 268753
21.2%
None
ValueCountFrequency (%)
ç 470
48.7%
é 450
46.6%
í 21
 
2.2%
ã 21
 
2.2%
ô 2
 
0.2%
Å 1
 
0.1%

level1Gid
Text

Missing 

Distinct 1487
Distinct (%) 1.1%
Missing 159606
Missing (%) 54.9%
Memory size 2.2 MiB

Length

Max length 8
Median length 7
Mean length 7.463638302
Min length 6

Characters and Unicode

Total characters 979916
Distinct characters 38
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 311
Unique (%) 0.2%

Sample

1st row NLD.14_1
2nd row AUS.8_1
3rd row GMB.4_1
4th row NZL.12_1
5th row MDG.2_1
Value                  Count   Frequency (%)
idn.9_1                14953   11.4%
nld.14_1               10559   8.0%
nld.9_1                9828    7.5%
nld.3_1                4946    3.8%
nld.4_1                4599    3.5%
idn.32_1               4239    3.2%
nld.11_1               3806    2.9%
nld.8_1                3327    2.5%
idn.21_1               2715    2.1%
idn.19_1               2571    2.0%
Other values (1477)    69749   53.1%

Most occurring characters

ValueCountFrequency (%)
1 182381
18.6%
_ 131288
13.4%
. 130974
13.4%
N 97543
10.0%
D 94303
9.6%
L 52020
 
5.3%
I 46874
 
4.8%
9 32947
 
3.4%
2 29421
 
3.0%
4 21865
 
2.2%
Other values (28) 160300
16.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 393406
40.1%
Decimal Number 324248
33.1%
Connector Punctuation 131288
 
13.4%
Other Punctuation 130974
 
13.4%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 97543
24.8%
D 94303
24.0%
L 52020
13.2%
I 46874
11.9%
R 12803
 
3.3%
U 12027
 
3.1%
S 11947
 
3.0%
A 11663
 
3.0%
E 7425
 
1.9%
G 5986
 
1.5%
Other values (16) 40815
10.4%
Decimal Number
ValueCountFrequency (%)
1 182381
56.2%
9 32947
 
10.2%
2 29421
 
9.1%
4 21865
 
6.7%
3 20675
 
6.4%
0 8105
 
2.5%
8 8018
 
2.5%
5 7788
 
2.4%
7 6598
 
2.0%
6 6450
 
2.0%
Connector Punctuation
ValueCountFrequency (%)
_ 131288
100.0%
Other Punctuation
ValueCountFrequency (%)
. 130974
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 586510
59.9%
Latin 393406
40.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 97543
24.8%
D 94303
24.0%
L 52020
13.2%
I 46874
11.9%
R 12803
 
3.3%
U 12027
 
3.1%
S 11947
 
3.0%
A 11663
 
3.0%
E 7425
 
1.9%
G 5986
 
1.5%
Other values (16) 40815
10.4%
Common
ValueCountFrequency (%)
1 182381
31.1%
_ 131288
22.4%
. 130974
22.3%
9 32947
 
5.6%
2 29421
 
5.0%
4 21865
 
3.7%
3 20675
 
3.5%
0 8105
 
1.4%
8 8018
 
1.4%
5 7788
 
1.3%
Other values (2) 13048
 
2.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 979916
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 182381
18.6%
_ 131288
13.4%
. 130974
13.4%
N 97543
10.0%
D 94303
9.6%
L 52020
 
5.3%
I 46874
 
4.8%
9 32947
 
3.4%
2 29421
 
3.0%
4 21865
 
2.2%
Other values (28) 160300
16.4%

level1Name
Text

Missing 

Distinct 1461
Distinct (%) 1.1%
Missing 159606
Missing (%) 54.9%
Memory size 2.2 MiB

Length

Max length 32
Median length 29
Mean length 10.45431557
Min length 3

Characters and Unicode

Total characters 1372568
Distinct characters 113
Distinct categories 8
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 304
Unique (%) 0.2%

Sample

1st row Zuid-Holland
2nd row South Australia
3rd row North Bank
4th row Otago
5th row Antsiranana
Value                  Count   Frequency (%)
barat                  18889   10.3%
jawa                   16803   9.2%
zuid-holland           10365   5.7%
noord-holland          9828    5.4%
utara                  9113    5.0%
sumatera               6314    3.5%
fryslân                4946    2.7%
maluku                 4793    2.6%
gelderland             4599    2.5%
timur                  4042    2.2%
Other values (1597)    93030   50.9%

Most occurring characters

ValueCountFrequency (%)
a 231846
16.9%
r 105670
 
7.7%
l 90239
 
6.6%
n 84908
 
6.2%
e 74291
 
5.4%
o 73856
 
5.4%
d 66662
 
4.9%
t 65823
 
4.8%
u 54960
 
4.0%
51430
 
3.7%
Other values (103) 472883
34.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1084612
79.0%
Uppercase Letter 208552
 
15.2%
Space Separator 51430
 
3.7%
Dash Punctuation 26174
 
1.9%
Other Punctuation 1768
 
0.1%
Close Punctuation 12
 
< 0.1%
Open Punctuation 12
 
< 0.1%
Modifier Symbol 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 231846
21.4%
r 105670
9.7%
l 90239
 
8.3%
n 84908
 
7.8%
e 74291
 
6.8%
o 73856
 
6.8%
d 66662
 
6.1%
t 65823
 
6.1%
u 54960
 
5.1%
i 49062
 
4.5%
Other values (60) 187295
17.3%
Uppercase Letter
ValueCountFrequency (%)
B 27853
13.4%
H 21609
10.4%
N 19857
9.5%
J 19737
9.5%
S 16349
 
7.8%
U 13543
 
6.5%
Z 13026
 
6.2%
T 11771
 
5.6%
M 8804
 
4.2%
G 7769
 
3.7%
Other values (23) 48234
23.1%
Other Punctuation
ValueCountFrequency (%)
. 1179
66.7%
' 383
 
21.7%
/ 159
 
9.0%
, 33
 
1.9%
! 14
 
0.8%
Space Separator
ValueCountFrequency (%)
51430
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 26174
100.0%
Close Punctuation
ValueCountFrequency (%)
) 12
100.0%
Open Punctuation
ValueCountFrequency (%)
( 12
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1293164
94.2%
Common 79404
 
5.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 231846
17.9%
r 105670
 
8.2%
l 90239
 
7.0%
n 84908
 
6.6%
e 74291
 
5.7%
o 73856
 
5.7%
d 66662
 
5.2%
t 65823
 
5.1%
u 54960
 
4.3%
i 49062
 
3.8%
Other values (93) 395847
30.6%
Common
ValueCountFrequency (%)
51430
64.8%
- 26174
33.0%
. 1179
 
1.5%
' 383
 
0.5%
/ 159
 
0.2%
, 33
 
< 0.1%
! 14
 
< 0.1%
) 12
 
< 0.1%
( 12
 
< 0.1%
` 8
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1361854
99.2%
None 10700
 
0.8%
Latin Ext Additional 14
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 231846
17.0%
r 105670
 
7.8%
l 90239
 
6.6%
n 84908
 
6.2%
e 74291
 
5.5%
o 73856
 
5.4%
d 66662
 
4.9%
t 65823
 
4.8%
u 54960
 
4.0%
51430
 
3.8%
Other values (52) 462169
33.9%
None
ValueCountFrequency (%)
â 4986
46.6%
á 1003
 
9.4%
í 907
 
8.5%
é 873
 
8.2%
ó 401
 
3.7%
ð 351
 
3.3%
ä 236
 
2.2%
ö 232
 
2.2%
ã 195
 
1.8%
ý 169
 
1.6%
Other values (36) 1347
 
12.6%
Latin Ext Additional
ValueCountFrequency (%)
8
57.1%
3
 
21.4%
1
 
7.1%
ế 1
 
7.1%
1
 
7.1%

level2Gid
Text

Missing 

Distinct 4380
Distinct (%) 3.4%
Missing 161386
Missing (%) 55.5%
Memory size 2.2 MiB

Length

Max length 12
Median length 10
Mean length 9.976110322
Min length 7

Characters and Unicode

Total characters 1292026
Distinct characters 38
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
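
The identifiers in the levelNGid variables, such as `NLD.14.43_1` in the sample for this variable, look like GADM-style hierarchical codes: a three-character area code, dot-separated indices for each administrative level, and a suffix after the underscore. The report does not name the scheme, so this reading is an assumption. The helper names `parse_gid` and `parent_gid` below are hypothetical and serve only to sketch how such an identifier could be split and walked up one level.

```python
def parse_gid(gid):
    """Split a GID such as 'NLD.14.43_1' into (area code, indices, suffix).

    Indices are kept as strings because the report does not guarantee
    that they are always numeric.
    """
    body, _, suffix = gid.partition("_")
    country, *indices = body.split(".")
    return country, indices, suffix or None

def parent_gid(gid):
    """Return the GID one level up, or None for a level-0 code.

    Assumes the parent carries the same suffix as the child, which holds
    for the sample rows shown in this report.
    """
    country, indices, suffix = parse_gid(gid)
    if not indices:
        return None
    if len(indices) == 1:
        return country
    parent = ".".join([country, *indices[:-1]])
    return f"{parent}_{suffix}" if suffix else parent

print(parse_gid("NLD.14.43_1"))
print(parent_gid("NLD.14.43_1"))
print(parent_gid("NLD.14_1"))
```

Applied to the first sample row, this walks from `NLD.14.43_1` to `NLD.14_1` to `NLD`, which matches the first sample rows of level1Gid and level0Gid.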

Unique

Unique 1433
Unique (%) 1.1%

Sample

1st row NLD.14.43_1
2nd row AUS.8.23_1
3rd row GMB.4.1_1
4th row NZL.12.1_1
5th row MDG.2.1_1
Value                  Count    Frequency (%)
idn.9.5_1              4808     3.7%
idn.9.24_1             2640     2.0%
idn.9.16_1             2196     1.7%
nld.14.2_1             1727     1.3%
idn.9.7_1              1658     1.3%
idn.32.4_1             1460     1.1%
idn.32.15_1            1399     1.1%
nld.9.4_1              1326     1.0%
nld.6.1_1              1283     1.0%
idn.9.4_1              1268     1.0%
Other values (4370)    109747   84.7%

Most occurring characters

ValueCountFrequency (%)
. 258702
20.0%
1 210783
16.3%
_ 129512
10.0%
N 97521
 
7.5%
D 94264
 
7.3%
2 64997
 
5.0%
L 51564
 
4.0%
I 46832
 
3.6%
4 45478
 
3.5%
9 45448
 
3.5%
Other values (28) 246925
19.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 515758
39.9%
Uppercase Letter 388054
30.0%
Other Punctuation 258702
20.0%
Connector Punctuation 129512
 
10.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 97521
25.1%
D 94264
24.3%
L 51564
13.3%
I 46832
12.1%
R 12368
 
3.2%
U 11995
 
3.1%
A 11590
 
3.0%
S 11200
 
2.9%
E 6988
 
1.8%
G 5568
 
1.4%
Other values (16) 38164
 
9.8%
Decimal Number
ValueCountFrequency (%)
1 210783
40.9%
2 64997
 
12.6%
4 45478
 
8.8%
9 45448
 
8.8%
3 43448
 
8.4%
5 28694
 
5.6%
6 23220
 
4.5%
8 20113
 
3.9%
7 17487
 
3.4%
0 16090
 
3.1%
Other Punctuation
ValueCountFrequency (%)
. 258702
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 129512
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 903972
70.0%
Latin 388054
30.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 97521
25.1%
D 94264
24.3%
L 51564
13.3%
I 46832
12.1%
R 12368
 
3.2%
U 11995
 
3.1%
A 11590
 
3.0%
S 11200
 
2.9%
E 6988
 
1.8%
G 5568
 
1.4%
Other values (16) 38164
 
9.8%
Common
ValueCountFrequency (%)
. 258702
28.6%
1 210783
23.3%
_ 129512
14.3%
2 64997
 
7.2%
4 45478
 
5.0%
9 45448
 
5.0%
3 43448
 
4.8%
5 28694
 
3.2%
6 23220
 
2.6%
8 20113
 
2.2%
Other values (2) 33577
 
3.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1292026
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 258702
20.0%
1 210783
16.3%
_ 129512
10.0%
N 97521
 
7.5%
D 94264
 
7.3%
2 64997
 
5.0%
L 51564
 
4.0%
I 46832
 
3.6%
4 45478
 
3.5%
9 45448
 
3.5%
Other values (28) 246925
19.1%

level2Name
Text

Missing 

Distinct 4256
Distinct (%) 3.3%
Missing 161392
Missing (%) 55.5%
Memory size 2.2 MiB

Length

Max length 32
Median length 28
Mean length 9.585540438
Min length 2

Characters and Unicode

Total characters 1241385
Distinct characters 144
Distinct categories 10
Distinct scripts 2
Distinct blocks 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1354
Unique (%) 1.0%

Sample

1st row Lisse
2nd row Kangaroo Island
3rd row Central Baddibu
4th row Central Otago
5th row Diana
Value                  Count    Frequency (%)
bogor                  7004     4.1%
kota                   4404     2.6%
sukabumi               2674     1.6%
manggarai              2078     1.2%
de                     1744     1.0%
s-gravenhage           1727     1.0%
serdang                1718     1.0%
barat                  1713     1.0%
cianjur                1658     1.0%
tengah                 1627     1.0%
Other values (4584)    143265   84.5%

Most occurring characters

ValueCountFrequency (%)
a 152288
 
12.3%
e 125630
 
10.1%
n 86546
 
7.0%
r 83679
 
6.7%
o 71740
 
5.8%
i 59435
 
4.8%
t 50086
 
4.0%
l 47691
 
3.8%
u 47166
 
3.8%
g 43531
 
3.5%
Other values (134) 473593
38.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1016711
81.9%
Uppercase Letter 171920
 
13.8%
Space Separator 40106
 
3.2%
Dash Punctuation 7908
 
0.6%
Other Punctuation 3873
 
0.3%
Decimal Number 366
 
< 0.1%
Open Punctuation 240
 
< 0.1%
Close Punctuation 236
 
< 0.1%
Modifier Symbol 24
 
< 0.1%
Initial Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 152288
15.0%
e 125630
12.4%
n 86546
 
8.5%
r 83679
 
8.2%
o 71740
 
7.1%
i 59435
 
5.8%
t 50086
 
4.9%
l 47691
 
4.7%
u 47166
 
4.6%
g 43531
 
4.3%
Other values (69) 248919
24.5%
Uppercase Letter
ValueCountFrequency (%)
B 22354
 
13.0%
S 16829
 
9.8%
M 14214
 
8.3%
K 12909
 
7.5%
T 10498
 
6.1%
H 8804
 
5.1%
C 7669
 
4.5%
A 7609
 
4.4%
D 7480
 
4.4%
L 7475
 
4.3%
Other values (33) 56079
32.6%
Decimal Number
ValueCountFrequency (%)
1 138
37.7%
0 63
17.2%
7 39
 
10.7%
6 36
 
9.8%
2 27
 
7.4%
3 25
 
6.8%
4 20
 
5.5%
5 13
 
3.6%
9 3
 
0.8%
8 2
 
0.5%
Other Punctuation
ValueCountFrequency (%)
' 2148
55.5%
. 1637
42.3%
, 60
 
1.5%
/ 24
 
0.6%
& 3
 
0.1%
# 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
40106
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 7908
100.0%
Open Punctuation
ValueCountFrequency (%)
( 240
100.0%
Close Punctuation
ValueCountFrequency (%)
) 236
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 24
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1188631
95.8%
Common 52754
 
4.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 152288
 
12.8%
e 125630
 
10.6%
n 86546
 
7.3%
r 83679
 
7.0%
o 71740
 
6.0%
i 59435
 
5.0%
t 50086
 
4.2%
l 47691
 
4.0%
u 47166
 
4.0%
g 43531
 
3.7%
Other values (112) 420839
35.4%
Common
ValueCountFrequency (%)
40106
76.0%
- 7908
 
15.0%
' 2148
 
4.1%
. 1637
 
3.1%
( 240
 
0.5%
) 236
 
0.4%
1 138
 
0.3%
0 63
 
0.1%
, 60
 
0.1%
7 39
 
0.1%
Other values (12) 179
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1234475
99.4%
None 6826
 
0.5%
IPA Ext 58
 
< 0.1%
Latin Ext Additional 25
 
< 0.1%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 152288
 
12.3%
e 125630
 
10.2%
n 86546
 
7.0%
r 83679
 
6.8%
o 71740
 
5.8%
i 59435
 
4.8%
t 50086
 
4.1%
l 47691
 
3.9%
u 47166
 
3.8%
g 43531
 
3.5%
Other values (63) 466683
37.8%
None
ValueCountFrequency (%)
â 1341
19.6%
á 1059
15.5%
ú 785
11.5%
é 654
9.6%
ó 431
 
6.3%
í 407
 
6.0%
ð 278
 
4.1%
ö 253
 
3.7%
è 218
 
3.2%
ä 182
 
2.7%
Other values (54) 1218
17.8%
IPA Ext
ValueCountFrequency (%)
ə 58
100.0%
Latin Ext Additional
ValueCountFrequency (%)
13
52.0%
5
 
20.0%
3
 
12.0%
3
 
12.0%
1
 
4.0%
Punctuation
ValueCountFrequency (%)
1
100.0%

level3Gid
Text

Missing 

Distinct 3681
Distinct (%) 5.8%
Missing 227914
Missing (%) 78.3%
Memory size 2.2 MiB

Length

Max length 15
Median length 14
Mean length 12.13957513
Min length 9

Characters and Unicode

Total characters 764599
Distinct characters 36
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1291
Unique (%) 2.0%

Sample

1st row MDG.2.1.5_1
2nd row BEL.2.1.3_1
3rd row IDN.18.1.5_1
4th row IDN.19.9.2_1
5th row IDN.19.6.1_1
Value                  Count   Frequency (%)
idn.9.5.3_1            2876    4.6%
idn.9.4.13_1           1247    2.0%
idn.9.7.13_1           1190    1.9%
idn.21.9.5_1           848     1.3%
idn.9.16.5_1           799     1.3%
idn.9.16.3_1           763     1.2%
idn.29.9.7_1           672     1.1%
idn.19.6.5_1           644     1.0%
idn.9.16.1_1           612     1.0%
idn.9.24.5_1           548     0.9%
Other values (3671)    52785   83.8%

Most occurring characters

ValueCountFrequency (%)
. 188948
24.7%
1 127130
16.6%
_ 62984
 
8.2%
N 47817
 
6.3%
D 47033
 
6.2%
I 45947
 
6.0%
2 43506
 
5.7%
3 33491
 
4.4%
9 30877
 
4.0%
5 21788
 
2.8%
Other values (26) 115078
15.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 324197
42.4%
Other Punctuation 188948
24.7%
Uppercase Letter 188470
24.6%
Connector Punctuation 62984
 
8.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 47817
25.4%
D 47033
25.0%
I 45947
24.4%
R 5081
 
2.7%
E 5015
 
2.7%
A 4620
 
2.5%
U 3451
 
1.8%
B 3258
 
1.7%
C 3078
 
1.6%
L 3032
 
1.6%
Other values (14) 20138
10.7%
Decimal Number
ValueCountFrequency (%)
1 127130
39.2%
2 43506
 
13.4%
3 33491
 
10.3%
9 30877
 
9.5%
5 21788
 
6.7%
4 20943
 
6.5%
6 14539
 
4.5%
7 11698
 
3.6%
8 11264
 
3.5%
0 8961
 
2.8%
Other Punctuation
ValueCountFrequency (%)
. 188948
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 62984
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 576129
75.4%
Latin 188470
 
24.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 47817
25.4%
D 47033
25.0%
I 45947
24.4%
R 5081
 
2.7%
E 5015
 
2.7%
A 4620
 
2.5%
U 3451
 
1.8%
B 3258
 
1.7%
C 3078
 
1.6%
L 3032
 
1.6%
Other values (14) 20138
10.7%
Common
ValueCountFrequency (%)
. 188948
32.8%
1 127130
22.1%
_ 62984
 
10.9%
2 43506
 
7.6%
3 33491
 
5.8%
9 30877
 
5.4%
5 21788
 
3.8%
4 20943
 
3.6%
6 14539
 
2.5%
7 11698
 
2.0%
Other values (2) 20225
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 764599
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 188948
24.7%
1 127130
16.6%
_ 62984
 
8.2%
N 47817
 
6.3%
D 47033
 
6.2%
I 45947
 
6.0%
2 43506
 
5.7%
3 33491
 
4.4%
9 30877
 
4.0%
5 21788
 
2.8%
Other values (26) 115078
15.1%

level3Name
Text

Missing 

Distinct 3486
Distinct (%) 5.7%
Missing 229332
Missing (%) 78.8%
Memory size 2.2 MiB

Length

Max length 32
Median length 27
Mean length 9.412143066
Min length 2

Characters and Unicode

Total characters 579468
Distinct characters 110
Distinct categories 8
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1204
Unique (%) 2.0%

Sample

1st row Nosibe
2nd row Turnhout
3rd row Jailolo
4th row Kairatu
5th row Amahai
Value                  Count   Frequency (%)
caringin               3007    3.4%
barat                  2340    2.7%
bogor                  2175    2.5%
utara                  1682    1.9%
tengah                 1483    1.7%
muara                  1352    1.5%
selatan                1306    1.5%
gembong                1247    1.4%
n.a                    1246    1.4%
cipanas                1191    1.4%
Other values (3808)    70921   80.6%

Most occurring characters

ValueCountFrequency (%)
a 90654
15.6%
n 47306
 
8.2%
i 40003
 
6.9%
r 33679
 
5.8%
e 31527
 
5.4%
o 30937
 
5.3%
u 29442
 
5.1%
26384
 
4.6%
g 26350
 
4.5%
t 19974
 
3.4%
Other values (100) 203212
35.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 455743
78.6%
Uppercase Letter 85691
 
14.8%
Space Separator 26384
 
4.6%
Decimal Number 3871
 
0.7%
Other Punctuation 3280
 
0.6%
Dash Punctuation 1588
 
0.3%
Open Punctuation 1493
 
0.3%
Close Punctuation 1418
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 90654
19.9%
n 47306
10.4%
i 40003
8.8%
r 33679
 
7.4%
e 31527
 
6.9%
o 30937
 
6.8%
u 29442
 
6.5%
g 26350
 
5.8%
t 19974
 
4.4%
l 17227
 
3.8%
Other values (47) 88644
19.5%
Uppercase Letter
ValueCountFrequency (%)
S 10208
11.9%
B 10163
11.9%
C 9261
10.8%
T 7994
9.3%
P 6383
 
7.4%
M 5977
 
7.0%
K 5833
 
6.8%
L 4110
 
4.8%
G 3699
 
4.3%
A 3157
 
3.7%
Other values (22) 18906
22.1%
Decimal Number
ValueCountFrequency (%)
1 869
22.4%
2 826
21.3%
3 416
10.7%
8 396
10.2%
0 365
9.4%
4 321
 
8.3%
9 254
 
6.6%
5 165
 
4.3%
7 165
 
4.3%
6 94
 
2.4%
Other Punctuation
ValueCountFrequency (%)
. 2918
89.0%
, 127
 
3.9%
' 115
 
3.5%
/ 115
 
3.5%
: 2
 
0.1%
* 2
 
0.1%
! 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
26384
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1588
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1493
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1418
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 541434
93.4%
Common 38034
 
6.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 90654
16.7%
n 47306
 
8.7%
i 40003
 
7.4%
r 33679
 
6.2%
e 31527
 
5.8%
o 30937
 
5.7%
u 29442
 
5.4%
g 26350
 
4.9%
t 19974
 
3.7%
l 17227
 
3.2%
Other values (79) 174335
32.2%
Common
ValueCountFrequency (%)
26384
69.4%
. 2918
 
7.7%
- 1588
 
4.2%
( 1493
 
3.9%
) 1418
 
3.7%
1 869
 
2.3%
2 826
 
2.2%
3 416
 
1.1%
8 396
 
1.0%
0 365
 
1.0%
Other values (11) 1361
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 578313
99.8%
None 1139
 
0.2%
Latin Ext Additional 16
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 90654
15.7%
n 47306
 
8.2%
i 40003
 
6.9%
r 33679
 
5.8%
e 31527
 
5.5%
o 30937
 
5.3%
u 29442
 
5.1%
26384
 
4.6%
g 26350
 
4.6%
t 19974
 
3.5%
Other values (63) 202057
34.9%
None
ValueCountFrequency (%)
ü 192
16.9%
é 165
14.5%
è 113
9.9%
ó 100
8.8%
á 97
8.5%
ã 71
 
6.2%
â 71
 
6.2%
ö 63
 
5.5%
ä 55
 
4.8%
ñ 29
 
2.5%
Other values (24) 183
16.1%
Latin Ext Additional
ValueCountFrequency (%)
13
81.2%
2
 
12.5%
1
 
6.2%

iucnRedListCategory
Text

Missing 

Distinct 9
Distinct (%) < 0.1%
Missing 167789
Missing (%) 57.7%
Memory size 2.2 MiB

Length

Max length 2
Median length 2
Mean length 2
Min length 2

Characters and Unicode

Total characters 246218
Distinct characters 11
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
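
The two-letter values of this variable appear to be IUCN Red List category codes. The sketch below pairs each code with its standard category name and recomputes the frequency shares from the counts reported for this variable. The dictionary names `IUCN_CATEGORIES` and `counts` are introduced here for illustration, and treating the denominator as the number of non-missing records is an assumption.

```python
# Standard IUCN Red List category names for the codes seen in this column.
IUCN_CATEGORIES = {
    "LC": "Least Concern",
    "NE": "Not Evaluated",
    "NT": "Near Threatened",
    "VU": "Vulnerable",
    "EN": "Endangered",
    "CR": "Critically Endangered",
    "EX": "Extinct",
    "DD": "Data Deficient",
    "EW": "Extinct in the Wild",
}

# Counts copied from the value table of iucnRedListCategory in this report.
counts = {
    "LC": 91089, "NE": 17845, "NT": 7674, "VU": 4076, "EN": 1736,
    "CR": 526, "EX": 130, "DD": 25, "EW": 8,
}

total = sum(counts.values())  # number of non-missing records

# VU, EN and CR together form the IUCN "threatened" categories.
threatened = counts["VU"] + counts["EN"] + counts["CR"]

for code, n in counts.items():
    print(f"{code} ({IUCN_CATEGORIES[code]}): {n} ({100 * n / total:.1f}%)")
print(f"Threatened (VU + EN + CR): {threatened}")
```

The total of 123109 equals the 290898 observations minus the 167789 missing values, and the share computed for LC rounds to the 74.0% shown in the report.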

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row LC
2nd row LC
3rd row LC
4th row NT
5th row VU
Value    Count   Frequency (%)
lc       91089   74.0%
ne       17845   14.5%
nt       7674    6.2%
vu       4076    3.3%
en       1736    1.4%
cr       526     0.4%
ex       130     0.1%
dd       25      < 0.1%
ew       8       < 0.1%

Most occurring characters

ValueCountFrequency (%)
C 91615
37.2%
L 91089
37.0%
N 27255
 
11.1%
E 19719
 
8.0%
T 7674
 
3.1%
V 4076
 
1.7%
U 4076
 
1.7%
R 526
 
0.2%
X 130
 
0.1%
D 50
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 246218
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 91615
37.2%
L 91089
37.0%
N 27255
 
11.1%
E 19719
 
8.0%
T 7674
 
3.1%
V 4076
 
1.7%
U 4076
 
1.7%
R 526
 
0.2%
X 130
 
0.1%
D 50
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 246218
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 91615
37.2%
L 91089
37.0%
N 27255
 
11.1%
E 19719
 
8.0%
T 7674
 
3.1%
V 4076
 
1.7%
U 4076
 
1.7%
R 526
 
0.2%
X 130
 
0.1%
D 50
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 246218
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 91615
37.2%
L 91089
37.0%
N 27255
 
11.1%
E 19719
 
8.0%
T 7674
 
3.1%
V 4076
 
1.7%
U 4076
 
1.7%
R 526
 
0.2%
X 130
 
0.1%
D 50
 
< 0.1%